Commit Graph

321 Commits

Author SHA1 Message Date
Kostya Serebryany
6c2479bee4 Fix buffer overflow in Lexer
Summary:
Fix PR22407, where the Lexer overflows the buffer when parsing
 #include<\
(end of file after slash)

Test Plan:
Added a test that will trigger in asan build.
This case is also covered by the clang-fuzzer bot.

Reviewers: rnk

Reviewed By: rnk

Subscribers: cfe-commits

Differential Revision: http://reviews.llvm.org/D9489

llvm-svn: 236466
2015-05-04 22:30:29 +00:00
Benjamin Kramer
f04f98d543 Use delegating ctors to reduce code duplication. NFC.
llvm-svn: 231476
2015-03-06 14:15:57 +00:00
David Majnemer
5a54977ea8 Lex: Don't crash if both conflict markers are on the same line
We would check if the terminator marker is on a newline.  However, the
logic would end up out-of-bounds if the terminator marker immediately
follows the start marker.

This fixes PR21820.

llvm-svn: 224210
2014-12-14 04:53:11 +00:00
Richard Smith
3e3a705062 [c++1z] Support for u8 character literals.
llvm-svn: 221576
2014-11-08 06:08:42 +00:00
Jay Foad
6af95d3864 Fix warning in Altivec code when building with GCC 4.8.2 on Ubuntu 14.04.
llvm-svn: 220855
2014-10-29 14:42:12 +00:00
Aaron Ballman
dd69ef38db C++1y is now C++14!
Changes diagnostic options, language standard options, diagnostic identifiers, diagnostic wording to use c++14 instead of c++1y. It also modifies related test cases to use the updated diagnostic wording.

llvm-svn: 215982
2014-08-19 15:55:55 +00:00
Rafael Espindola
cd0b380a3c Use StringRef instead of MemoryBuffer&.
This code doesn't care where the data it is processing comes from, so a
StringRef is probably the most natural interface.

llvm-svn: 215448
2014-08-12 15:46:24 +00:00
David Blaikie
3d95d858f9 Change MemoryBuffer* to MemoryBuffer& parameter to Lexer::ComputePreamble
(dropping const from the reference as MemoryBuffer is immutable already,
so const is just redundant - and while I'd personally put const
everywhere, that's not the LLVM Way (see llvm::Type for another example
of an immutable type where "const" is omitted for brevity))

Changing the pointer argument to a reference parameter makes call sites
identical between callers with unique_ptrs or raw pointers, minimizing
the churn in a pending unique_ptr migrations.

llvm-svn: 215391
2014-08-11 22:08:06 +00:00
Alp Toker
d4a3f0e894 Hide the concept of diagnostic levels from lex, parse and sema
The compilation pipeline doesn't actually need to know about the high-level
concept of diagnostic mappings, and hiding the final computed level presents
several simplifications and other potential benefits.

The only exceptions are opportunistic checks to see whether expensive code
paths can be avoided for diagnostics that are guaranteed to be ignored at a
certain SourceLocation.

This commit formalizes that invariant by introducing and using
DiagnosticsEngine::isIgnored() in place of individual level checks throughout
lex, parse and sema.

llvm-svn: 211005
2014-06-15 23:30:39 +00:00
Alp Toker
7755aff556 Remove historical Unicode TODOs
There's no immediate demand or plan to work on these.

llvm-svn: 209090
2014-05-18 18:37:59 +00:00
Craig Topper
d2d442ca73 [C++11] Use 'nullptr'. Lex edition.
llvm-svn: 209083
2014-05-17 23:10:59 +00:00
Alp Toker
2d57cea256 Provide and use a safe Token::getRawIdentifier() accessor
llvm-svn: 209061
2014-05-17 04:53:25 +00:00
Roman Divacky
6150990d59 Revert r205436:
Extend the SSE2 comment lexing to AVX2. Only 16byte align when not on AVX2.

        This provides some 3% speedup when preprocessing gcc.c as a single file.


The patch is wrong, it always uses SSE2, and when I fix that there's no speedup
at all. I am not sure where the 3% came from previously.

--Thi lie, and those below, will be ignored--

M    Lex/Lexer.cpp

llvm-svn: 205548
2014-04-03 18:04:52 +00:00
Roman Divacky
071e830bbb Extend the SSE2 comment lexing to AVX2. Only 16byte align when not on AVX2.
This provides some 3% speedup when preprocessing gcc.c as a single file.

llvm-svn: 205436
2014-04-02 17:27:03 +00:00
Benjamin Kramer
867ea1d426 [C++11] Replace llvm::tie with std::tie.
llvm-svn: 202639
2014-03-02 13:01:17 +00:00
Richard Smith
35ddad0723 Fix a minor bug in lexing pp-numbers with digit separators: if a pp-number contains "'e+", the pp-number ends between the 'e' and the '+'.
llvm-svn: 202533
2014-02-28 20:06:02 +00:00
Richard Smith
8b7258bdb3 PR18855: Add support for UCNs and UTF-8 encoding within ud-suffixes.
llvm-svn: 201532
2014-02-17 21:52:30 +00:00
Alp Toker
bfa3934f27 Rename language option MicrosoftMode to MSVCCompat
There's been long-standing confusion over the role of these two options. This
commit makes the necessary changes to differentiate them clearly, following up
from r198936.

MicrosoftExt (aka. fms-extensions):
 Enable largely unobjectionable Microsoft language extensions to ease
 portability. This mode, also supported by gcc, is used for building software
 like FreeBSD and Linux kernel extensions that share code with Windows drivers.

MSVCCompat (aka. -fms-compatibility, formerly MicrosoftMode):
 Turn on a special mode supporting 'heinous' extensions for drop-in
 compatibility with the Microsoft Visual C++ product. Standards-compilant C and
 C++ code isn't guaranteed to work in this mode. Implies MicrosoftExt.

Note that full -fms-compatibility mode is currently enabled by default on the
Windows target, which may need tuning to serve as a reasonable default.

See cfe-commits for the full discourse, thread 'r198497 - Move MS predefined
type_info out of InitializePredefinedMacros'

No change in behaviour.

llvm-svn: 199209
2014-01-14 12:51:41 +00:00
Chandler Carruth
5553d0d4ca Sort all the #include lines with LLVM's utils/sort_includes.py which
encodes the canonical rules for LLVM's style. I noticed this had drifted
quite a bit when cleaning up LLVM, so wanted to clean up Clang as well.

llvm-svn: 198686
2014-01-07 11:51:46 +00:00
Alp Toker
6de6da603e Lexer: Issue -Wbackslash-newline-escape for line comments
The warning for backslash and newline separated by whitespace was missed in
this code path.

backslash<whitespace><newline> is handled differently from compiler to compiler
so it's important to warn consistently where there's ambiguity.

Matches similar handling of block comments and non-comment lines.

llvm-svn: 197331
2013-12-14 23:32:31 +00:00
Alp Toker
08c2500f9c Fix raw lex crash and -frewrite-includes noeol-at-eof failure
Raw lexers don't have a preprocessor so we need to null check.

llvm-svn: 197245
2013-12-13 17:04:55 +00:00
Justin Bogner
5353513058 Lex: Don't restrict legal UCNs when preprocessing assembly
The C and C++ standards disallow using universal character names to
refer to some characters, such as basic ascii and control characters,
so we reject these sequences in the lexer. However, when the
preprocessor isn't being used on C or C++, it doesn't make sense to
apply these restrictions.

Notably, accepting these characters avoids issues with unicode escapes
when GHC uses the compiler as a preprocessor on haskell sources.

Fixes rdar://problem/14742289

llvm-svn: 193067
2013-10-21 05:02:28 +00:00
Richard Smith
7f2707a7f4 Per updates to D3781, allow underscore under ' in a pp-number, and allow ' in a #line directive.
llvm-svn: 191443
2013-09-26 18:13:20 +00:00
Richard Smith
fde9485297 Implement C++1y digit separator proposal (' as a digit separator). This is not
yet approved by full committee, but was unanimously supported by EWG.

llvm-svn: 191417
2013-09-26 03:33:06 +00:00
Richard Smith
5acb759f39 Avoid a signed/unsigned comparison warning with compilers that don't know how
to handle constant expressions.

llvm-svn: 191336
2013-09-24 22:13:21 +00:00
Richard Smith
2a98862be2 Handle standard libraries that miss out the space when defining the standard
literal operators. Also, for now, allow the proposed C++1y "il", "i", and "if"
suffixes too. (Will revert the latter if LWG decides not to go ahead with that
change after all.)

llvm-svn: 191274
2013-09-24 04:06:10 +00:00
Eli Friedman
29749d2e3b Fix use-after-free in r190980.
llvm-svn: 190984
2013-09-19 01:51:23 +00:00
Eli Friedman
0834a4b901 Make Preprocessor::Lex non-recursive.
Before this patch, Lex() would recurse whenever the current lexer changed (e.g.
upon entry into a macro). This patch turns the recursion into a loop: the
various lex routines now don't return a token when the current lexer changes,
and at the top level Preprocessor::Lex() now loops until it finds a token.
Normally, the recursion wouldn't end up being very deep, but the recursion depth
can explode in edge cases like a bunch of consecutive macros which expand to
nothing (like in the testcase test/Preprocessor/macro_expand_empty.c in this
patch).

<rdar://problem/14569770>

llvm-svn: 190980
2013-09-19 00:41:32 +00:00
Alexander Kornienko
37d6b18633 Use new UnicodeCharSet interface.
Summary: This is a Clang part of http://llvm-reviews.chandlerc.com/D1534

Reviewers: jordan_rose, klimek, rsmith

Reviewed By: rsmith

CC: cfe-commits

Differential Revision: http://llvm-reviews.chandlerc.com/D1535

llvm-svn: 189583
2013-08-29 12:12:31 +00:00
Eli Friedman
cefc7eafc6 Fix "//" comments with -traditional-cpp in C++.
Apparently, gcc's -traditional-cpp behaves slightly differently in C++ mode;
specifically, it discards "//" comments.  Match gcc's behavior.

<rdar://problem/14808126>

llvm-svn: 189515
2013-08-28 20:53:32 +00:00
Jordan Rose
4c55d45b13 Respect -Wnewline-eof even in C++11 mode.
If the user has requested this warning, we should emit it, even if it's not
an extension in the current language mode. However, being an extension is
more important, so prefer the pedantic warning or the pedantic-compatibility
warning if those are enabled.

<rdar://problem/12922063>

llvm-svn: 189110
2013-08-23 15:42:01 +00:00
Fariborz Jahanian
d38ad47cfa ObjectiveC migrator: More work towards
insertion of ObjC audit pragmas.

llvm-svn: 188733
2013-08-20 00:07:23 +00:00
Richard Smith
f4198b7598 C++1y literal suffix support:
* Allow ns, us, ms, s, min, h as numeric ud-suffixes
 * Allow s as string ud-suffix

llvm-svn: 186933
2013-07-23 08:14:48 +00:00
Michael J. Spencer
8c39840087 Replace Count{Leading,Trailing}Zeros_{32,64} with count{Leading,Trailing}Zeros.
llvm-svn: 182675
2013-05-24 21:42:04 +00:00
Argyrios Kyrtzidis
dc9fdaf217 [modules] If we hit a failure while loading a PCH/module, abort parsing instead of trying to continue in an invalid state.
Also don't let libclang create a PCH with such an error.

Fixes rdar://13953768

llvm-svn: 182629
2013-05-24 05:44:08 +00:00
Argyrios Kyrtzidis
065d720c31 [Lexer] Improve Lexer::getSourceText() when the given range deals with function macro arguments.
This is a modified version of a patch by Manuel Klimek.

llvm-svn: 182055
2013-05-16 21:37:39 +00:00
Richard Smith
0f7f6f1abc Typo and misc comment fix.
llvm-svn: 181583
2013-05-10 02:36:35 +00:00
Argyrios Kyrtzidis
0903f8dac5 [libclang] Make sure the preable does not truncate comments.
rdar://13647445

llvm-svn: 179907
2013-04-19 23:24:25 +00:00
Richard Smith
06d274fdb7 Add -Wc99-compat warning for C11 unicode string and character literals.
llvm-svn: 176817
2013-03-11 18:01:42 +00:00
Richard Smith
9b36209e31 When lexing in C11 mode, accept unicode character and string literals, per C11
6.4.4.4/1 and 6.4.5/1.

llvm-svn: 176780
2013-03-09 23:56:02 +00:00
Jordan Rose
864b810739 Preprocessor: don't consider // to be a line comment in -E -std=c89 mode.
It's beneficial when compiling to treat // as the start of a line
comment even in -std=c89 mode, since it's not valid C code (with a few
rare exceptions) and is usually intended as such. We emit a pedantic
warning and then continue on as if line comments were enabled.
This has been our behavior for quite some time.

However, people use the preprocessor for things besides C source files.
In today's prompting example, the input contains (unquoted) URLs, which
contain // but should still be preserved.

This change instructs the lexer to treat // as a plain token if Clang is
in C90 mode and generating preprocessed output rather than actually compiling.

<rdar://problem/13338743>

llvm-svn: 176526
2013-03-05 22:51:04 +00:00
Jordan Rose
cb8a1aca35 Preprocessor: preserve whitespace in -traditional-cpp mode.
Note that unlike GNU cpp we currently do not preserve whitespace in macros
(even in -traditional-cpp mode).

<rdar://problem/12897179>

llvm-svn: 175778
2013-02-21 18:53:19 +00:00
Jordan Rose
58c61e006f Properly validate UCNs for C99 and C++03 (both more restrictive than C(++)11).
Add warnings under -Wc++11-compat, -Wc++98-compat, and -Wc99-compat when a
particular UCN is incompatible with a different standard, and -Wunicode when
a UCN refers to a surrogate character in C++03.

llvm-svn: 174788
2013-02-09 01:10:25 +00:00
Jordan Rose
a2100d755a Pull Lexer's CharInfo table out for general use throughout Clang.
Rewriting the same predicates over and over again is bad for code size and
code maintainence. Using the functions in <ctype.h> is generally unsafe
unless they are specified to be locale-independent (i.e. only isdigit and
isxdigit).

The next commit will try to clean up uses of <ctype.h> functions within Clang.

llvm-svn: 174765
2013-02-08 22:30:22 +00:00
Jordan Rose
cc538345be Lexer: Don't warn about Unicode in preprocessor directives.
This allows people to use Unicode in their #pragma mark and in macros
that exist only to be string-ized.

<rdar://problem/13107323&13121362>

llvm-svn: 174081
2013-01-31 19:48:48 +00:00
Jordan Rose
f649795f84 Fix r173881 to properly skip invalid UTF-8 characters in raw lexing and -E.
This caused hangs as we processed the same invalid byte over and over.

<rdar://problem/13115651>

llvm-svn: 173959
2013-01-30 19:21:12 +00:00
Dmitri Gribenko
9feeef40f5 Move UTF conversion routines from clang/lib/Basic to llvm/lib/Support
This is required to use them in TableGen.

llvm-svn: 173924
2013-01-30 12:06:08 +00:00
Jordan Rose
17441589c3 Don't warn about Unicode characters in -E mode.
People use the C preprocessor for things other than C files. Some of them
have Unicode characters. We shouldn't warn about Unicode characters
appearing outside of identifiers in this case.

There's not currently a way for the preprocessor to tell if it's in -E mode,
so I added a new flag, derived from the PreprocessorOutputOptions. This is
only used by the Unicode warnings for now, but could conceivably be used by
other warnings or even behavioral differences later.

<rdar://problem/13107323>

llvm-svn: 173881
2013-01-30 01:52:57 +00:00
Jordan Rose
cccbdbf0db PR15067 (again): Don't warn about UCNs in C90 if we're raw-lexing.
Fixes a crash. Thanks, Richard.

llvm-svn: 173701
2013-01-28 17:49:02 +00:00
Jordan Rose
c0cba27230 PR15067: Don't assert when a UCN appears in a C90 file.
Unfortunately, we can't accept the UCN as an extension because we're
required to treat it as two tokens for preprocessing purposes.

llvm-svn: 173622
2013-01-27 20:12:04 +00:00
NAKAMURA Takumi
e8f83dbbd8 Lexer.cpp: Fix a warning with ptrdiff_t on i686. [-Wsign-compare]
llvm-svn: 173447
2013-01-25 14:57:21 +00:00
Jordan Rose
8b4af2ae88 Clarify comment: "diagnose" is better than "warn" when emitting an error.
Thanks, Dmitri.

llvm-svn: 173400
2013-01-25 00:20:28 +00:00
Jordan Rose
62db5066e9 Add a fixit for \U1234 -> \u1234.
llvm-svn: 173371
2013-01-24 20:50:52 +00:00
Jordan Rose
4246ae0089 As an extension, treat Unicode whitespace characters as whitespace.
llvm-svn: 173370
2013-01-24 20:50:50 +00:00
Jordan Rose
7f43dddae0 Handle universal character names and Unicode characters outside of literals.
This is a missing piece for C99 conformance.

This patch handles UCNs by adding a '\\' case to LexTokenInternal and
LexIdentifier -- if we see a backslash, we tentatively try to read in a UCN.
If the UCN is not syntactically well-formed, we fall back to the old
treatment: a backslash followed by an identifier beginning with 'u' (or 'U').

Because the spelling of an identifier with UCNs still has the UCN in it, we
need to convert that to UTF-8 in Preprocessor::LookUpIdentifierInfo.

Of course, valid code that does *not* use UCNs will see only a very minimal
performance hit (checks after each identifier for non-ASCII characters,
checks when converting raw_identifiers to identifiers that they do not
contain UCNs, and checks when getting the spelling of an identifier that it
does not contain a UCN).

This patch also adds basic support for actual UTF-8 in the source. This is
treated almost exactly the same as UCNs except that we consider stray
Unicode characters to be mistakes and offer a fixit to remove them.

llvm-svn: 173369
2013-01-24 20:50:46 +00:00
Dmitri Gribenko
f857950d39 Remove useless 'llvm::' qualifier from names like StringRef and others that are
brought into 'clang' namespace by clang/Basic/LLVM.h

llvm-svn: 172323
2013-01-12 19:30:44 +00:00
Argyrios Kyrtzidis
86f1a935dc Pull the bulk of Lexer::MeasureTokenLength() out into a new function,
Lexer::getRawToken().

No functionality change.

llvm-svn: 171771
2013-01-07 19:16:18 +00:00
Richard Smith
2bf7fdb723 s/CPlusPlus0x/CPlusPlus11/g
llvm-svn: 171367
2013-01-02 11:42:31 +00:00
Chandler Carruth
3a02247dc9 Sort all of Clang's files under 'lib', and fix up the broken headers
uncovered.

This required manually correcting all of the incorrect main-module
headers I could find, and running the new llvm/utils/sort_includes.py
script over the files.

I also manually added quite a few missing headers that were uncovered by
shuffling the order or moving headers up to be main-module-headers.

llvm-svn: 169237
2012-12-04 09:13:33 +00:00
Richard Smith
9a67f47882 Teach Lexer::getSpelling about raw string literals. Specifically, if a raw
string literal needs cleaning (because it contains line-splicing in the
encoding prefix or in the ud-suffix), do not clean the section between the
double-quotes -- that's the "raw" bit!

llvm-svn: 168776
2012-11-28 07:29:00 +00:00
Nico Weber
4e270380c1 Fix crash on end-of-file after \ in a char literal, fixes PR14369.
This makes LexCharConstant() look more like LexStringLiteral(), which doesn't
have this bug. Add tests for eof after \ for several other cases.

llvm-svn: 168269
2012-11-17 20:25:54 +00:00
Eli Friedman
b699e619fe Fix an assertion failure printing the unused-label fixit in files using CRLF line endings. <rdar://problem/12639047>.
llvm-svn: 167900
2012-11-14 01:28:38 +00:00
Daniel Dunbar
cf3f2c49ea Revert r167801, "[preprocessor] When #including something that contributes no
tokens at all,". This change broke External/Nurbs in LLVM test-suite.

llvm-svn: 167858
2012-11-13 19:12:37 +00:00
Nico Weber
7cc28804e2 UCNs in char literals are done (in LiteralSupport), remove FIXME. Expand UCN FIXME in LexNumericConstant.
llvm-svn: 167818
2012-11-13 06:25:15 +00:00
Argyrios Kyrtzidis
4f10a3e9f0 [preprocessor] When #including something that contributes no tokens at all,
don't recursively continue lexing.

This avoids a stack overflow with a sequence of many empty #includes.
rdar://11988695

llvm-svn: 167801
2012-11-13 01:03:15 +00:00
Argyrios Kyrtzidis
36675b75fb In Lexer::LexTokenInternal, avoid code duplication; no functionality change.
llvm-svn: 167800
2012-11-13 01:02:40 +00:00
Nico Weber
158a31abe2 s/BCPLComment/LineComment/
llvm-svn: 167690
2012-11-11 07:02:14 +00:00
Argyrios Kyrtzidis
d53d0daab9 Take into account that there may be a BOM at the beginning of the file,
when computing the size of the precompiled preamble.

llvm-svn: 166659
2012-10-25 01:51:45 +00:00
Dmitri Gribenko
b8e9e7507e StringRef'ize Preprocessor::CreateString().
llvm-svn: 164555
2012-09-24 21:07:17 +00:00
Roman Divacky
e637711ae0 Dont cast away const needlessly. Found by gcc48 -Wcast-qual.
llvm-svn: 163325
2012-09-06 15:59:27 +00:00
Eli Friedman
324adad966 Make a bunch of methods on Lexer private.
llvm-svn: 162970
2012-08-31 02:29:37 +00:00
Dmitri Gribenko
4aa05c571e Lexer: remove dead stores. Found by Clang static analyzer!
llvm-svn: 160973
2012-07-30 17:59:40 +00:00
Richard Smith
608c0b65d7 Add warning flag -Winvalid-pp-token for preprocessing-tokens which have
undefined behaviour, and move the diagnostic for '' from an Error into
an ExtWarn in this group. This is important for some users of the preprocessor,
and is necessary for gcc compatibility.

llvm-svn: 159335
2012-06-28 07:51:56 +00:00
James Dennett
f442d2455b Documentation cleanup:
* Removed docs for Lexer::makeFileCharRange from Lexer.cpp, as they're in
  the header file;
* Reworked the documentation for SkipBlockComment so that it doesn't confuse
  Doxygen's comment parsing;
* Added another summary with \brief markup.

llvm-svn: 158618
2012-06-17 03:40:43 +00:00
Jordan Rose
127f6eef7e [-E] Emit a rewritten _Pragma on its own line.
1. Teach Lexer that pragma lexers are like macro expansions at EOF.
2. Treat pragmas like #define/#undef when printing.
3. If we just printed a directive, add a newline before any more tokens.
(4. Miscellaneous cleanup in PrintPreprocessedOutput.cpp)

PR10594 and <rdar://problem/11562490> (two separate related problems)

llvm-svn: 158571
2012-06-15 23:33:51 +00:00
James Dennett
ff3c995624 Documentation cleanup: escape backslashes in Doxygen comments.
llvm-svn: 158552
2012-06-15 21:36:54 +00:00
Richard Smith
e6799ddae8 PR12717: Clang supports hexadecimal floating-point literals in all language
modes. For languages other than C99/C11, this isn't quite a conforming
extension, and for C++11, it breaks some reasonable code containing
user-defined literals.

In languages which don't officially have hexfloats, pare back this extension
to only apply in cases where the token starts 0x and does not contain an
underscore. The extension is still not quite conforming, but it's a lot closer
now.

llvm-svn: 158487
2012-06-15 05:07:49 +00:00
David Blaikie
2af2b3071d Fix PR13065.
This condition (added in r158093) was overly conservative.

llvm-svn: 158483
2012-06-15 00:47:13 +00:00
Dmitri Gribenko
702b732d6f Correct method name in comment: from LexRawToken to LexFromRawLexer, according
to a change done long ago in r57393.

llvm-svn: 158243
2012-06-08 23:19:37 +00:00
Jordan Rose
288c421b3d Insert a space if necessary when suggesting CFBridgingRetain/Release.
This was a problem for people who write 'return(result);'

Also fix ARCMT's corresponding code, though there's no test case for this
because implicit casts like this are rejected by the migrator for being
ambiguous, and explicit casts have no problem.

<rdar://problem/11577346>

llvm-svn: 158130
2012-06-07 01:10:31 +00:00
David Blaikie
d5321247c4 Add a -rewrite-includes option, which is similar to -rewrite-macros, but only expands #include directives.
Patch contributed by Lubos Lunak (l.lunax@suse.cz).
Review by Matt Beaumont-Gay (matthewbg@google.com).

llvm-svn: 158093
2012-06-06 18:52:13 +00:00
David Blaikie
987bcf9462 Escape \n and \r in doxycomment.
llvm-svn: 158091
2012-06-06 18:43:20 +00:00
Benjamin Kramer
e5fbc6c85d Lexer::ReadToEndOfLine: Only build the string if it's actually used and do so in a less malloc-intensive way.
llvm-svn: 157064
2012-05-18 19:32:16 +00:00
Seth Cantrell
e83c731cad Support -Wc++98-compat-pedantic as requested:
http://lists.cs.uiuc.edu/pipermail/cfe-commits/Week-of-Mon-20120409/056126.html

llvm-svn: 154655
2012-04-13 03:43:23 +00:00
Seth Cantrell
10ac7205ce C++11 no longer requires files to end with a newline
llvm-svn: 154643
2012-04-13 01:00:34 +00:00
Francois Pichet
7ebc4c1910 ext_reserved_user_defined_literal must not default to Error in MicrosoftMode. Hence create ext_ms_reserved_user_defined_literal that doesn't default to Error; otherwise MSVC headers won't parse.
Fixes PR12383.

llvm-svn: 154273
2012-04-07 23:09:23 +00:00
David Blaikie
bbafb8a745 Unify naming of LangOptions variable/get function across the Clang stack (Lex to AST).
The member variable is always "LangOpts" and the member function is always "getLangOpts".

Reviewed by Chris Lattner

llvm-svn: 152536
2012-03-11 07:00:24 +00:00
Richard Smith
0df56f4a90 Implement C++11 [lex.ext]p10 for string and character literals: a ud-suffix not
starting with an underscore is ill-formed.

Since this rule rejects programs that were using <inttypes.h>'s macros, recover
from this error by treating the ud-suffix as a separate preprocessing-token,
with a DefaultError ExtWarn. The approach of treating such cases as two tokens
is under discussion for standardization, but is in any case a conforming
extension and allows existing codebases to keep building while the committee
makes up its mind.

Reword the warning on the definition of literal operators not starting with
underscores (which are, strangely, legal) to more explicitly state that such
operators can't be called by literals. Remove the special-case diagnostic for
hexfloats, since it was both triggering in the wrong cases and incorrect.

llvm-svn: 152287
2012-03-08 02:39:21 +00:00
Richard Smith
3e4a60a2cd Add -Wc++11-compat warning for string and character literals followed by
identifiers, in cases where those identifiers would be treated as
user-defined literal suffixes in C++11.

llvm-svn: 152198
2012-03-07 03:13:00 +00:00
Richard Smith
d67aea28f6 User-defined literals: reject string and character UDLs in all places where the
grammar requires a string-literal and not a user-defined-string-literal. The
two constructs are still represented by the same TokenKind, in order to prevent
a combinatorial explosion of different kinds of token. A flag on Token tracks
whether a ud-suffix is present, in order to prevent clients from needing to look
at the token's spelling.

llvm-svn: 152098
2012-03-06 03:21:47 +00:00
Richard Smith
e18f0faff2 Lexing support for user-defined literals. Currently these lex as the same token
kinds as the underlying string literals, and we silently drop the ud-suffix;
those issues will be fixed by subsequent patches.

llvm-svn: 152012
2012-03-05 04:02:15 +00:00
Argyrios Kyrtzidis
0d9e24b1db Change Lexer::makeFileCharRange() to have it accept a CharSourceRange
instead of a SourceRange, and handle the case where the range is
a char (not token) range.

llvm-svn: 149677
2012-02-03 05:58:29 +00:00
Argyrios Kyrtzidis
abff5f1271 Improve Lexer::getImmediateMacroName to take into account inner macros
of macro arguments.

For "MAC1( MAC2(foo) )" and location of 'foo' token it would return
"MAC1" instead of "MAC2".

llvm-svn: 148704
2012-01-23 16:58:33 +00:00
Argyrios Kyrtzidis
85e7671b71 Enhance Lexer::makeFileCharRange to check for ranges inside a macro argument
expansion, in which case it returns a file range in the location where the
argument was spelled.

llvm-svn: 148551
2012-01-20 16:52:43 +00:00
Argyrios Kyrtzidis
7838a2bffb Introduce Lexer::getSourceText() that returns a string for the source
that the given source range encompasses.

llvm-svn: 148481
2012-01-19 15:59:19 +00:00
Argyrios Kyrtzidis
a99e02d019 Introduce Lexer::makeFileCharRange() that accepts a token source range
and returns a character range with file locations.

llvm-svn: 148480
2012-01-19 15:59:14 +00:00
Argyrios Kyrtzidis
1b07c344b4 For Lexer's isAt[Start/End]OfMacroExpansion add an out parameter for the macro
start/end location.

It is commonly needed after calling the function; with this way we avoid
recalculating it.

llvm-svn: 148479
2012-01-19 15:59:08 +00:00
Anna Zaks
1bea4bf590 Refactor: Pull getImmediateMacroName() out of DiagnosticRenderer and
into Lexer and Preprocessor; making it widely available.

llvm-svn: 148410
2012-01-18 20:17:16 +00:00
Chandler Carruth
5b15a9be6a Two variables had been added for an assert, but their values were
re-computed rather than the variables be re-used just after the assert.
Just use the variables since we have them already. Fixes an unused
variable warning.

Also fix an 80-column violation.

llvm-svn: 148212
2012-01-15 09:03:45 +00:00
Argyrios Kyrtzidis
8a26c4de64 In Lexer::getCharAndSizeSlow[NoWarn] if we come up against
\<newline><newline>

don't consume the second newline.

Thanks to David Blaikie for pointing out the crash!

llvm-svn: 147138
2011-12-22 04:38:07 +00:00