The fast scanner generator for Java™ with full Unicode support

Last update: Dec 18, 2022

Overview

JFlex

JFlex is a lexical analyzer generator (also known as scanner generator) for Java.

JFlex takes as input a specification with a set of regular expressions and corresponding actions. It generates the Java source of a lexer that reads input, matches it against the regular expressions in the spec file, and runs the corresponding action when a regular expression matches. Lexers are usually the first front-end step in compilers, matching keywords, comments, operators, etc., and generating an input token stream for parsers.

JFlex lexers are based on deterministic finite automata (DFAs). They are fast, without expensive backtracking.
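
To make "DFA-based, without backtracking" more concrete, here is a minimal hand-written sketch in Java. It is not the code JFlex generates: the class name TinyDfaScanner, the state numbering, and the token labels are made up for illustration, and a real generated lexer uses packed transition tables rather than a switch. The sketch only shows the general idea of stepping one state per input character and emitting the longest match.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a tiny DFA that tokenizes identifiers and integers. */
final class TinyDfaScanner {
    // DFA states (illustrative only, not JFlex's generated tables).
    private static final int START = 0, IDENT = 1, NUMBER = 2, DEAD = -1;

    /** Transition function: exactly one state change per input character. */
    private static int step(int state, char c) {
        switch (state) {
            case START:
                if (Character.isLetter(c)) return IDENT;
                if (Character.isDigit(c)) return NUMBER;
                return DEAD;
            case IDENT:
                return Character.isLetterOrDigit(c) ? IDENT : DEAD;
            case NUMBER:
                return Character.isDigit(c) ? NUMBER : DEAD;
            default:
                return DEAD;
        }
    }

    /** Scans the input left to right, emitting the longest match at each position. */
    static List<String> scan(String input) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            if (Character.isWhitespace(input.charAt(pos))) {
                pos++; // whitespace separates tokens and is skipped
                continue;
            }
            int state = START;
            int lastAccept = -1;        // end index of the longest match so far
            int lastAcceptState = DEAD; // state that accepted it
            int i = pos;
            while (i < input.length()) {
                int next = step(state, input.charAt(i));
                if (next == DEAD) break;
                state = next;
                i++;
                // IDENT and NUMBER are both accepting states.
                lastAccept = i;
                lastAcceptState = state;
            }
            if (lastAccept < 0) {
                throw new IllegalArgumentException("could not match input at " + pos);
            }
            String kind = lastAcceptState == IDENT ? "IDENT" : "NUMBER";
            tokens.add(kind + "(" + input.substring(pos, lastAccept) + ")");
            pos = lastAccept;
        }
        return tokens;
    }
}
```

For example, TinyDfaScanner.scan("foo 42 x1") yields the tokens IDENT(foo), NUMBER(42), and IDENT(x1).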

Usage

For documentation and more information see the JFlex documentation and the wiki.

Usage with Maven

You need Maven 3.5.2 or later, and JDK 8 or later.

Place grammar files in the src/main/flex/ directory.
Extend the build section of the project POM with the jflex-maven-plugin:

  <build>
    <plugins>
      <plugin>
        <groupId>de.jflex</groupId>
        <artifactId>jflex-maven-plugin</artifactId>
        <version>1.8.2</version>
        <executions>
          <execution>
            <goals>
              <goal>generate</goal>
            </goals>
          </execution>
        </executions>
      </plugin>
    </plugins>
  </build>

Voilà: Java code is produced in target/generated-sources/ during the generate-sources phase (which happens before the compile phase) and included in the compilation scope.

Usage with ant

You need ant, the binary jflex jar and JDK 8 or later.

Define ant task

<taskdef classname="jflex.anttask.JFlexTask" name="jflex"
         classpath="path-to-jflex.jar"/>

Use it

<jflex file="src/grammar/parser.flex" destdir="build/generated/"/>
<javac srcdir="build/generated/" destdir="build/classes/"/>

Usage with Bazel

We provide a jflex rule

load("@jflex_rules//jflex:jflex.bzl", "jflex")

jflex(
    name = "",           # Choose a rule name
    srcs = [],           # Add input lex specifications
    outputs = [],        # List expected generated files
)

See the sample simple BUILD file.

Usage in CLI

You need the binary jflex jar and JDK 8 or later.

You can also use JFlex directly from the command line:

jflex/bin/jflex src/grammar/parser.flex

Or:

java -jar jflex-full-1.8.2.jar -d output src/grammar/parser.flex

Other build tools

See Build tool plugins.

Examples

Have a look at the sample project: simple and other examples.

Contributing

JFlex is free software; contributions are welcome. See the Contributing page for instructions.

Source layout

The top-level directory of the JFlex git repository contains:

cup: a copy of the CUP runtime
cup-maven-plugin: a simple Maven plugin to generate a parser with CUP
docs: the Markdown sources for the user manual
java: Java sources [WIP, Bazel]
javatests: Java sources of tests [WIP, Bazel]
jflex: JFlex, the scanner/lexer generator for Java
jflex-maven-plugin: the JFlex Maven plugin, which helps to integrate JFlex in your project
jflex-unicode-plugin: the JFlex Unicode Maven plugin, used for compiling JFlex
testsuite: the regression test suite for JFlex
third_party: third-party libraries used by examples of the Bazel build system

Build from source

Build with Bazel

JFlex can be built with Bazel. The migration to Bazel is still a work in progress, for instance for the test suite.

You need Bazel.

bazel build //jflex:jflex_bin

This builds bazel-bin/jflex/jflex_bin, which you can run:

bazel-bin/jflex/jflex_bin --info

Or:

bazel run //jflex:jflex_bin -- --info

Build uberjar (aka fatjar aka deploy jar)

bazel build jflex/jflex_bin_deploy.jar

Continuous integration is done with Cirrus CI.

Build with Maven

You need JDK 8 or later.

./mvnw install

This generates jflex/target/jflex-full-1.9.0-SNAPSHOT.jar, which you can use, e.g.

java -jar jflex/target/jflex-full-1.9.0-SNAPSHOT.jar --info

Continuous integration is done with Travis.

Comments

[Bug] Error in skeleton.nested [sf#132]
Reported by jeningar on 2014-09-23 10:37 UTC

Hi Steve, hi Gerwin,

using the skeleton.nested, the resulting program showed the following behaviour:

echo "list sessions;" | sdmsh

hangs forever. But

echo "list sessions;" > /tmp/x
sdmsh < /tmp/x

terminates as it should.

After merging the skeleton.default and skeleton.nested this erroneous behaviour vanished.

Diffing the old and new skeleton.nested shows (< == new; > == old):

[ronald@cheetah shell]$ diff skeleton.nested* | more
39c39
<
---
>
169c169
< while (numRead == 0) { // bug #130 discussion; while is better than if
---
> if (numRead == 0) {
403,409d402
< // cached fields:
< int zzCurrentPosL;
< int zzMarkedPosL;
< int zzEndReadL = zzEndRead;
< char [] zzBufferL = zzBuffer;
< char [] zzCMapL = ZZ_CMAP;
<
413c406,411
< zzMarkedPosL = zzMarkedPos;
---
> // cached fields:
> int zzCurrentPosL;
> int zzMarkedPosL = zzMarkedPos;
> int zzEndReadL = zzEndRead;
> char [] zzBufferL = zzBuffer;
> char [] zzCMapL = ZZ_CMAP;

The HUGE difference is that the variables (notably zzEndReadL) aren't initialized by skeleton.default in every iteration of the while loop.

During the merge I wondered why there are two skeletons in the first place. Those who don't want to read from multiple streams just don't call yypushStream() and friends. One skeleton would be more than enough (and a link for backward compatibility).

For the sake of completeness I attached the new skeleton.nested.

Regards,

Ronald
bug
opened by lsf37 27
Circular dependencies on java_cup/javacup

java_cup/javacup needs jflex to build, and jflex needs java_cup/javacup to build. The two depend on each other circularly and require a bootstrap, so there is no clean way to build either from source without a pre-made binary. Ideally this should change so that either can be built fully from source without requiring a pre-built jar. This must have been possible initially, before either existed: a chicken-and-egg situation. Java seems to have many such cases: jaxen/jdom, antlr/stringtemplate, and jflex/javacup. The JDK itself is another, although OpenJDK can technically be built from source, going back to 1.5 with gcj and GNU Classpath.

Hopefully that is possible here: some old version of either javacup or jflex may not need the other and could be the first step in a build-from-source solution. At present, each has to use binaries, which is not preferred. Thank you for considering this circular dependency issue.
question

opened by wltjr 25
[Bug] Re-enable scanning interactively or from a network byte stream [sf#130]
Reported by jeningar on 2014-08-05 13:14 UTC

Hi,

We have an application that receives commands over a TCP/IP socket. After each command a reply is sent. Normally the communication is interactive, which means a strict order of questions and answers. This works perfectly with jflex 1.4.x. (The grammar is easy from the jflex point of view. Every command is terminated by a semicolon, no read ahead necessary; this follows your FAQ answer on "I want my scanner to read from a network byte stream or from interactive stdin. Can I do this with JFlex?").

Now we have some problems with Jflex 1.6.0.

Jflex generates following code in zzRefill():

...
int requested = zzBuffer.length - zzEndRead;
int totalRead = 0;
while (totalRead < requested) {
  int numRead = zzReader.read(zzBuffer, zzEndRead + totalRead, requested - totalRead);
  if (numRead == -1) {
    break;
  }
  totalRead += numRead;
}
...

This code reads bytes from the input as long as the buffer isn't full or until EOF is reached. This is nice in case of reading a file, but leads to deadlocks if using an interactive scanner.

The easiest way to avoid the deadlock is to eliminate the while loop. If the assumption that a read() returns at least one character (or EOF) isn't valid, the while loop must be executed as long as nothing is read. (I tested this; it seems to work flawlessly).

Since I assume there was a good reason for this piece of code, I'd like to have a command line option to eliminate the while loop here. I'd be delighted if someone would explain to me why a repeated call of zzRefill() is worse than the while loop. It can't really be a performance issue: the cost of an I/O operation far exceeds the cost of a function call and a bit of basic arithmetic.
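
The two refill strategies discussed in this report can be sketched in plain Java. This is not JFlex's skeleton code: RefillSketch, ChunkReader, fillCompletely, and fillAtLeastOne are hypothetical names, and the reader below only simulates an interactive source by returning one chunk per read() call and counting the calls. On a real socket, the extra read() that fillCompletely issues after the first chunk is where the scanner would block.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;

/** Hypothetical sketch comparing two ways to refill a scanner buffer. */
final class RefillSketch {
    /** Reader that hands out one chunk per read() call and counts the calls. */
    static final class ChunkReader extends Reader {
        private final String[] chunks;
        private int next = 0;
        int readCalls = 0;

        ChunkReader(String... chunks) { this.chunks = chunks; }

        @Override
        public int read(char[] buf, int off, int len) {
            readCalls++;
            if (next >= chunks.length) return -1; // a real socket would block here
            String chunk = chunks[next++];
            int n = Math.min(len, chunk.length()); // any remainder is dropped in this sketch
            chunk.getChars(0, n, buf, off);
            return n;
        }

        @Override
        public void close() {}
    }

    /** Strategy from the report: keep reading until the buffer is full or EOF. */
    static int fillCompletely(Reader in, char[] buf) {
        try {
            int total = 0;
            while (total < buf.length) {
                int n = in.read(buf, total, buf.length - total);
                if (n == -1) break;
                total += n;
            }
            return total;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Proposed alternative: repeat the read only while nothing was read. */
    static int fillAtLeastOne(Reader in, char[] buf) {
        try {
            int n = in.read(buf, 0, buf.length);
            while (n == 0) {
                n = in.read(buf, 0, buf.length);
            }
            return n; // -1 signals EOF
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a single chunk "list sessions;" and a 64-character buffer, fillCompletely calls read() twice (the second call is the one that would wait for more input), while fillAtLeastOne returns after the first call.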

Regards,

Ronald
bug
opened by lsf37 24
Property test improvement

This pull request increases the coverage of the property tests by improving generators and updating test cases. The generators now choose cased characters more frequently. Tests with the caseless option have been added as a boolean, with updates that allow caseless to operate correctly. For State Set, updates ensure that the resize code is called, using the new OffsetGen generator. Properties and additional code were also added for State Set to improve coverage.

I am a part of a research group at the University of Illinois Chicago that works towards novel methods to detect bugs in software projects. This project is identified as being a good candidate for application of our research method to improve coverage using property tests. The above issue outlines the findings with a pull request of the enhanced property test to document the observed behavior.
testing

opened by jcoultas 18
%unicode 2.0 lexers throw IOOBE on input with surrogate chars

The default %unicode setting means a 10 times larger static memory footprint (see zzUnpackCMap). Switching to %unicode 2.0 leaves the Character.codePointAt() generation intact, leading to:

java.lang.ArrayIndexOutOfBoundsException: 65536

in advance() method.
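
The failure mode can be reproduced in isolation with a small Java sketch. It assumes, for illustration only, a character map with one entry per UTF-16 code unit (65536 entries); the class SurrogateIndexSketch and the local cmap are hypothetical and not the generated scanner code.

```java
/** Hypothetical sketch: indexing a BMP-sized table with a full code point. */
final class SurrogateIndexSketch {
    /** Looks up the character class of the code point at index 0. */
    static int classify(String input) {
        char[] cmap = new char[0x10000];          // one entry per UTF-16 code unit (assumed table size)
        int cp = Character.codePointAt(input, 0); // combines a surrogate pair into one code point
        return cmap[cp];                          // out of bounds for code points above 0xFFFF
    }
}
```

Calling classify("a") succeeds, while an input starting with the surrogate pair for U+10000 yields the code point 65536 and throws an ArrayIndexOutOfBoundsException.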
bug enhancement

opened by gregsh 18
Generate UnicodeProperties with Bazel
The jflex-unicode-maven-plugin is slow and fetches data from unicode.org which is also not very reliable. As a result, the generated code has been checked in. This is bad practice, and we have been modifying these generated files.

This is the first step in an attempt to replace the jflex-unicode-maven-plugin by Bazel:

Bazel fetches and caches all resources. It can use mirrors. See #522

With this change, only UnicodeProperties is re-generated, using versions given on the command line.

Instead of using a custom "skeleton", I'm using Apache Velocity template engine. See #520

I've rewritten more than I wanted because the URL was too much part of the previous model. See DataFileType.java

Effective changes in generated file:

Set default version to 9.0.0 instead of 9.0. This should be a no-op.

Use switch/case rather than chain of ifs

Bump unicode 3.1.0 to 3.1.1
opened by regisd 15
[Bug] Lexers don't work anymore after migration from JFlex 1.4 to JFlex >= 1.5 (infinite loops) [sf#134]

Reported by thierryblind on 2015-01-13 23:42 UTC

Hello, I'm a committer for the Eclipse PDT project (https://eclipse.org/pdt/) and I'm currently trying to migrate all the lexers (using JFlex 1.4) to JFlex >= 1.5. Sadly, I'm now facing infinite loops because some lexers don't seem to behave the same anymore. I added as an attachment a test project that contains some of the problematic lexers (in folder "parserTools"), the generated JFlex 1.4.3 and 1.5.1 classes, and a test case (src/launch/Tests.java). Could you please have a look and tell me what's happening? Thank you very much for your help,

Thierry.
bug

opened by lsf37 15
[Bug] unexpected Error: could not match input [sf#107]

Reported by kneunert on 2010-03-01 20:56 UTC

I'm using a bit of a trick in the Lexer like this:

<YYINITIAL> {identifier}[ { yybegin(ARRAY1); return someSymbol; }
<ARRAY1> $? { yybegin(ARRAY2); return someSymbol; }
<ARRAY2> $? { yybegin(ARRAY3); return someSymbol; }
<ARRAY3> $? { yybegin(YYINITIAL); return someSymbol; }
<YYINITIAL> ] { yybegin(YYINITIAL); return someSymbol; }

The trick here is that I use an unconventional optional character. This character is not there, so no character gets consumed, but a series of symbols is returned. This used to work in JLex, and it does not seem to work in JFlex anymore. I get this:

Symbol: [
Exception in thread "main" java.lang.Error: Error: could not match input
  at struktor.processor.Yylex.zzScanError(Yylex.java:439)
  at struktor.processor.Yylex.next_token(Yylex.java:590)
  at struktor.processor.MyMain.main(MyMain.java:19)

I have a simple test case for this. If needed, I can attach it to this ticket.

Thanks

Kim ( https://sourceforge.net/projects/struktor/ )
bug

opened by lsf37 15
[Bug] buffer expansion bug in yy_refill()? [sf#60]
Reported by smagoun on 2004-01-30 20:35 UTC

yy_refill() in skeleton.default and skeleton.nested seems to have a problem expanding the buffer correctly. The bug manifests itself when reading a lot of data at once. I ran into this using the Piccolo XML parser, which uses JFlex to parse XML. Piccolo died while reading a very long CDATA element in the XML. I tracked it to yy_refill(), which seems to have been copied from one of the skeleton files JFlex ships with.

The problem is that the buffer never expands properly when reading long input, which results in an ArrayIndexOutOfBoundsException. The following patch fixes Piccolo; I'm not sure if it applies to JFlex, but I'm guessing it might.

(I'm not convinced that the if() should check yy_currentPos>=buffer.length at all, but it seems harmless)

--- PiccoloLexer.java Sun Jul 7 14:21:18 2002
+++ PiccoloLexer copy.java Fri Jan 30 15:07:44 2004
@@ -3291,9 +3291,10 @@
 }

 /* is the buffer big enough? */
- if (yy_currentPos >= yy_buffer.length) {
+ if (yy_currentPos >= yy_buffer.length
+ || yy_markedPos >= yy_buffer.length) {
 /* if not: blow it up */
- char newBuffer[] = new char[yy_currentPos*2];
+ char newBuffer[] = new char[yy_buffer.length*2];
 System.arraycopy(yy_buffer, 0, newBuffer, 0, yy_buffer.length);
 yy_buffer = newBuffer;
 }
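
The idea behind the patch, growing the buffer from its own length rather than from a read position, can be sketched on its own. This is an illustrative helper under assumed names (BufferGrowthSketch, ensureCapacity); it is neither the Piccolo code nor the JFlex skeleton, and it ignores integer overflow for very large buffers.

```java
import java.util.Arrays;

/** Hypothetical sketch: grow a scanner buffer by doubling its own length. */
final class BufferGrowthSketch {
    /**
     * Returns a buffer in which index `needed` is valid. The new size is derived
     * from the current buffer length, so existing content always fits.
     */
    static char[] ensureCapacity(char[] buffer, int needed) {
        int newLength = Math.max(buffer.length, 1);
        while (newLength <= needed) {
            newLength *= 2; // double until the required index fits
        }
        // Copy only when the size actually changed; existing content is preserved.
        return newLength == buffer.length ? buffer : Arrays.copyOf(buffer, newLength);
    }
}
```

A 4-character buffer stays as it is for index 3, grows to 8 for index 4, and grows to 32 for index 20.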
bug
opened by lsf37 15
%eof{ ... %eof} is not being included when trying to upgrade to v1.8.1 from 1.7.0

Hello,

After trying an upgrade of OpenGrok to JFlex v1.8.1, I found that code specified in %eof{ is not included anymore in the generated lexers. I browsed through the JFlex manual, but I didn't see anything to indicate the handling would have changed.

(I did check but didn't see any other %eof open issues here.)

Please any advice?

Thank you.
bug

opened by idodeclare 14
error: orphaned default and error: 'else' without 'if'
Odd generated output with error: orphaned default and error: 'else' without 'if'.

qdox 1.12.1 builds with 1.4.3, but not 1.6.1 bootstrapped
qdox 2 builds with 1.4.3 and 1.6.1 bootstrapped
jflex 1.6.1 builds with 1.6.1 bootstrapped but not 1.4.3

The errors are the same when jflex 1.6.1 bootstrapped fails on qdox 1.12.1 and when jflex 1.4.3 fails on jflex 1.6.1. That is odd, and I cannot explain it.

qdox 1.12.1 under 1.6.1 bootstrapped via binary fails

* Compiling ...
src/java/JFlexLexer.java:1357: error: orphaned default
default:
^
src/java/JFlexLexer.java:2051: error: 'else' without 'if'
else {
^
2 errors
* ERROR: dev-java/qdox-1.12.1-r10::os-xtoo failed (compile phase):

jflex 1.6.1 under 1.4.3 fails, under 1.6.1 bootstrapped via binary it builds fine

Writing code to "java/LexScan.java"
* Compiling ...
src/main/java/LexScan.java:3697: error: orphaned default
default:
^
src/main/java/LexScan.java:3626: error: 'else' without 'if'
else {
^
2 errors

I have 1.4.3 built and running under Java 9. I do not believe the jdk version has anything to do with the generated output with errors. I think that has more to do with syntax or something.
invalid
opened by wltjr 14
UnicodeCaseless.flex.vm uses incorrect unicode escapes
The template uses \uXXXX for all code points, but this syntax is incorrect for code points > 0xFFFF (needs \U instead of \u).

[ ] investigate why this does not lead to test failures

[ ] fix it
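
A possible shape of the fix, sketched in Java rather than in the Velocity template itself: choose the escape form by code point range. The sketch assumes that the upper-case escape takes six hexadecimal digits in JFlex specifications (worth checking against the manual); JFlexEscapeSketch and escape are hypothetical names, not part of the JFlex code base.

```java
/** Hypothetical sketch: pick the escape form that fits the code point. */
final class JFlexEscapeSketch {
    /**
     * Formats a code point as backslash-u plus four hex digits for the BMP,
     * or backslash-U plus six hex digits (assumed syntax) above 0xFFFF.
     */
    static String escape(int codePoint) {
        if (!Character.isValidCodePoint(codePoint)) {
            throw new IllegalArgumentException("not a code point: " + codePoint);
        }
        return codePoint <= 0xFFFF
            ? String.format("\\u%04X", codePoint)
            : String.format("\\U%06X", codePoint);
    }
}
```

Under these assumptions, U+0041 is written with the four-digit form and U+1F600 with the six-digit form, so no supplementary code point is truncated to four digits.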

testing
opened by lsf37 0
build(deps-dev): bump guava from 30.1.1-jre to 31.1-jre
Bumps guava from 30.1.1-jre to 31.1-jre.

Release notes

Sourced from guava's releases.

31.1

Maven

<dependency>
  <groupId>com.google.guava</groupId>
  <artifactId>guava</artifactId>
  <version>31.1-jre</version>
  <!-- or, for Android: -->
  <version>31.1-android</version>
</dependency>

Jar files

31.1-jre.jar

31.1-android.jar

Guava requires one runtime dependency, which you can download here:

failureaccess-1.0.1.jar

Javadoc

31.1-jre

31.1-android

JDiff

31.1-jre vs. 31.0.1-jre

31.1-android vs. 31.0.1-android

31.1-android vs. 31.1-jre

Changelog

base: Deprecated the Throwables methods lazyStackTrace and lazyStackTraceIsLazy. They are no longer useful on any current platform. (6ebd7d8648)

collect: Added a new method ImmutableMap.Builder.buildKeepingLast(), which keeps the last value for any given key rather than throwing an exception when a key appears more than once. (68500b2c09)

collect: As a side-effect of the buildKeepingLast() change, the idiom ImmutableList.copyOf(Maps.transformValues(map, function)) may produce different results if function has side-effects. (This is not recommended.) (68500b2c09)

hash: Added Hashing.fingerprint2011(). (13f703c25f)

io: Changed ByteStreams.nullOutputStream() to follow the contract of OutputStream.write by throwing an exception if the range of bytes is out of bounds. (1cd85d01c9)

net: Added @CheckReturnValue to the package (with a few exceptions). (a0e2577de6)

net: Added HttpHeaders constant for Access-Control-Allow-Private-Network. (6dabbdf9c9)

util.concurrent: Added accumulate/update methods for AtomicDouble and AtomicDoubleArray. (2d875d327a)

APIs promoted from @Beta

base: Throwables methods getCausalChain and getCauseAs (dd462afa6b)

collect: Streams methods mapWithIndex and findLast (8079a29463)

collect: the remaining methods in Comparators: min, max, lexicographical, emptiesFirst, emptiesLast, isInOrder, isInStrictOrder (a3e411c3a4)

escape: various APIs (468c68a6ac)

... (truncated)


dependencies
opened by dependabot[bot] 0
allow length-1 expressions in char classes
Since we now allow macro uses in char class content, we may encounter arbitrary regular expressions. #996 turns these into properly reported errors, but we could actually allow all expressions that have a fixed length of 1. Some examples that would make sense:

%%
M = a
%%
[ {M} ] { }

%%
M = "a"
%%
[ {M} ] { }

%%
M = [a] | [b]
%%
[ {M} ] { }

(This is a follow-on from #888)
enhancement
opened by lsf37 1
replace inefficient java regexps with jflex scanner
They might also be a bit easier to read.

Tracking issue for:

[ ] https://github.com/jflex-de/jflex/security/code-scanning/12

[ ] https://github.com/jflex-de/jflex/security/code-scanning/13

code quality
opened by lsf37 0
Is there a way to suppress comment/absolute path in the generated code?
This is really annoying when you generate on different machines/OSes and the resulting class is under VCS: the path keeps changing.

I'm talking about

* This class is a scanner generated by
* <a href="http://www.jflex.de/">JFlex</a> 1.7.0-1 ....
enhancement
opened by hurricup 5

Releases(v1.8.2)

v1.8.2(May 3, 2020)
JFlex 1.8.2 is a small bugfix release. There are no new features.

fix bug that prevented %7bit scanners from being generated (#756)

fix %eof{ and %eofthrow{ code generation (#743)

updated bazel build rule

More detailed list of changes in milestone 1.8.2
v1.8.1(Feb 28, 2020)
JFlex 1.8.1 is a small maintenance release. There are no new features or bug fixes. The only change is

in dependency management for the CUP parser generator and runtime to re-enable building from source in the release package (#734)

More detailed list of changes in milestone 1.8.1
v1.8.0(Feb 26, 2020)
yychar type has been changed from int to long in order to support large files (> 2GB) (#605)

Add @SuppressWarnings("FallThrough") on generated lexer #454

Defend against spoon-feeding readers not fully populating the scanning buffer #543

Add support for Unicode 10.0 #540 11.0 #555 12.0 #556 and 12.1 #563

Unicode Emoji properties are supported for Unicode versions 8.0+ (#546)

Significantly decreased memory usage for unicode scanners from ~4MB to typical ~20kB. (#697)

Macro expressions in character classes are now allowed (#216, #654)

Expose yyatEOF() in generated scanner API (#644)

Pipe action | now works for <<EOF>> (#201)

Explicitly use UTF-8 encoding for skeleton files and dot files (#470)

Maven plugin now correctly checks #include file time stamp (#694)

Slightly optimised character classes when ^ operator is used (#682)

Normalised character class order. This has no influence on how text is matched, but makes --dump output more comparable. (#650)

Fixed a bug in the negation ! operator that in rare circumstances would not match everything covered by the negation (#567).

The . expression now does not match unpaired surrogates, since these are not characters. (#544)

Example specs now with build for ant, make, and maven

Introduced a code LexGenerator API. #428 #448

Add the jflex source in generated code #371 #399

Code cleanup

modularisation effort

Removed dead code class CharSet #480

Use @AutoValue #505

Fixed PMD violations #413 #418

Use Truth in tests #365 #660

Replace commons-io by guava #319

Dep updates

Updated maven dependencies #409

Updated the Maven wrapper to 0.4.2 #382

Build system

retired ant build #432

now supporting Bazel build

See all changes in milestone 1.8.0
v1.7.0(Sep 21, 2018)
Prerequisites

Compilation requires jdk7 and Maven 3.5.2

Execution requires jdk7 and Maven 3.0

Compilation of generated code requires jdk 5

CUP upgraded to 0.11b

Option --inputstreamctor has been removed (#195)

Code health

Codebase has valid doclint (#206)

Maven plugins update to use Java annotations rather than javadoc at-clauses.

jflex --version or --info or --help now exits with error code 0 (#194)

Unicode 8.0 and 9.0 are supported (#209)

documentation improvements (#152, #187, #215, #290)

added an --encoding option to specify input/output encoding (#164)

make jflex start script robust for other locales (#251)

report character position when %debug and %char are present (#207)

See https://github.com/jflex-de/jflex/milestone/10
v1.6.1(Nov 9, 2017)
Released 2015-03-16

1.6.1 is a maintenance release, fixing all known defects.

Changelog:

JFlex development, wiki, and issue tracker moved to https://github.com/jflex-de/

Fixed issue #130, "in caseless mode, chars in regexps not accepted caselessly": Caseless option works again as intended.

Fixed issue #131, "re-enable scanning interactively or from a network byte stream": JFlex now throws an IOException when a Reader returns 0 characters.

New example, shows how to deal with Readers that return 0 characters.

Command line scripts work again in repository version (contributed by Emma Strubell)

New options --warn-unused and --no-warn-unused that control warnings about unused macros.

Fixed issue #125: %apiprivate and %cup2 switches now no longer incompatible

Fix issue #133, "Error in skeleton.nested": Empty-string matches were taking precedence over EOF and caused non-termination. Now EOF is counted as the highest-priority empty match.

New warning when an expression matches the empty string (can lead to non-termination).

v1.6.0(Nov 9, 2017)
Released 2014-06-21

Unicode 7.0 is supported.

In %unicode mode, supplementary code points are now handled properly.

Regular expressions are now code-point based, rather than code-unit/char based.

Input streams are read as code point sequences - properly paired surrogate code units are read as a single character.

All supported Unicode properties now match supplementary characters when Unicode 3.0 or above is specified, or when no version is specified, causing the default Unicode version, Unicode 7.0 in this release, to be used.

New \u{...} escape sequence allows code points (and whitespace-separated sequences of code points) to be specified as 1-6 hexadecimal digit values.

Characters in matches printed in %debug mode are now Unicode escaped (\uXXXX) when they are outside the range 32..127.

detect javadoc class comment when followed by annotation(s) (#128)

removed the "switch" and "table" code generation options

Option --noinputstreamctor deprecated. By default no InputStream constructor is included in the generated scanner. The capability to include one is deprecated and will be removed in JFlex 1.7.

release_1_5_1(Nov 7, 2017)
fixed problem calling ./jflex start scripts (#127)

corrected documentation flaws (#126)

further documentation and website updates

JFlex now reports the correct version string

added support for CUP2 with %cup2 switch, based on patch by Andreas Wenger

v1.5.0(Nov 29, 2019)
Released 2014-03-23

the "switch" and "table" code generation options are deprecated and will be removed in JFlex 1.6

the JFlex license has been changed from GPL to BSD.

updated JFlex to CUP version 0.11a.

changed the build from Ant to Maven. 523d7a9

JFlex now mostly conforms with Unicode Regular Expressions UTS#18 Basic Unicode Support - Level 1. Supplementary code points (above the Basic Multilingual Plane) are not yet supported.

new meta characters supported: \s, \S, \d, \D, \w, \W.

nested character sets now supported, e.g. [[[ABC]D]E[FG]]

new character set operations supported: union (e.g. [A||B]), intersection (e.g. [A&&B]), set-difference (e.g. [A--B]), and symmetric difference (e.g. [A~~B]).

the meaning of the dot (".") meta character has been changed from [^\n] to [^\n\r\u000B\u000C\u0085\u2028\u2029]. Use the new --legacydot option to cause "." to be interpreted as [^\n].

new \R meta character matches any newline: "\r\n" | [\n\r\u000B\u000C\u0085\u2028\u2029].

new option --noinputstreamctor to not include an InputStream constructor in the generated scanner.

%include can now be used in the rules section (#117)

yychar and zzAtBOL should be reset for nested input streams (#107 & #108 )

fixed bug #109 (could not match input for empty string matches.)

fixed bug #112 & #119 (properly update zzFin when reallocating zzBuffer)

fixed bug #115 (noncompileable scanner generation when default locale is Turkish)

fixed bug #114 (zzEOFDone not included with pushed nested stream state)

fixed bug #105 (can't build examples/java/)

fixed bug #106 (impossible char class range should trigger syntax error)

release_1_4_3(Nov 7, 2017)
Released 2009-01-31

fixed bug #100 (lookahead syntax error)

fixed bug #97 (min_int in Java example scanner)

fixed bug #96 (zzEOFDone not reset in yyreset(Reader))

fixed bug #95 (%type and %int at the same time should produce error msg)

release_1_4_2(Nov 7, 2017)
Released 2008-05-28

implemented feature request #75: Now supports generics syntax for %type, %extends, etc

implemented feature request #156: Provided %ctorarg option to add arguments to constructor

fixed bug #80 (Reader.read might return 0)

fixed bug #57 (Ambiguous error message in macro expansion)

fixed bug #89 (Syntax error in input may cause NullPointerException)

fixed bug #85 (Need to defend against path blanks in jflex bash script)

fixed bug #82 (EOF actions may be ignored for same lex state)

fixed bug #81 (syntax error in generated ZZ_CMAP)

fixed bug #77 (lookahead and "|" actions)

fixed bug #74 (yytext() longer than expected with lookahead)

fixed bug #73 (OS/2 Java 1.1.8 Issues)

fixed bug #40 (dangerous lookahead check may fail)

release_1_4_1(Nov 7, 2017)
Released 2004-11-07

merged in patch by Don Brown (fixes #70 Uses Old JUnit method assertFalse)

merged in patch by Don Brown (fixes #62 buffer expansion bug in yy_refill()) Thanks to Binesh Bannerjee for providing a simpler test case for this problem.

fixed bug #69 (ArrayIndexOutOfBounds in IntCharSet)

fixed bug #68 (Cannot use lookahead with ignorecase)

converted dangerous lookahead error to warning

print info for EOF actions as well in %debug mode

fixed line number count for EOF actions

internal: removed unused methods in LexScan.flex and IntCharSet

release_1_4(Nov 7, 2017)
Released 2004-04-12

new, very fast minimization algorithm (also fixes memory issues)

new --jlex option for strict compatibility to JLex. Currently it changes %ignorecase to JLex semantics, that is, character classes are interpreted in a caseless way, too.
(fixes bug #59, %ignorecase ignored by char classes). Thanks to Edward D. Willink for spotting the incompatibility.

support for even larger scanners (up to 64K DFA states). Thanks to Karin Vespoor.

removed eclipse compiler warnings for generated classes (feature request #144)

implemented faster character classes (feature request #143). Expressions like [a-z] | [A-Z] are interpreted as one atomic class [a-zA-Z], reducing NFA states and generation time significantly for some specifications. This affects the generation process only, generated scanners remain the same.

new %apiprivate switch (feature request #141/1) that causes all generated and skeleton methods to be made private. Exceptions to this are user defined functions and the constructor. Thanks to Stephen Ostermiller for the suggestion.

allow user defined javadoc class comments (feature request #141/2) If the user code section ends with a javadoc comment, JFlex takes this instead of the generated comment. Thanks to Stephen Ostermiller for the suggestion.

fixed bug #50 (undefined macros in complement expressions do not throw exception in generator). Thanks to Stephen Ostermiller for the bug report.

fixed bug #51 (yypushStream/yypopStream in skeleton.nested work as advertised)

fixed bug #57 (no wrong macro warnings on regexp negation)

fixed bug #58 (%cupsym now also affects %cupdebug) Thanks to Eric Schweitz for the fix.

fixed bug #52 (single-line %initthrow works now in case of extra whitespace before newline)

yyreset() no longer closes the associated reader (use yyclose() explicitly for that). This makes some reader objects reusable (feature request #140). Thanks to Stephen Ostermiller for the suggestion.

fixed modifier order in generated code, removes jikes compiler warnings Thanks to Michael Wildpaner for the fix.

ant task now also works with ant >= 1.4 (fixes bug #54)

yyreset() does not declare an exception any more (fixes bug #65)

%cup does not include %eofclose in JLex mode (--jlex). (Fixes bug #63)

optional parameter to %eofclose: "%eofclose false" turns off %eofclose if it was turned on previously (e.g. by %cup). (Fixes bug #63)

jflex build script switched to ant

internal: central Options class for better integration with build tools and IDEs

internal: change naming scheme for generated internal variables from yy_ to zz to comply with Java naming standard. Thanks to Max Gilead for the patch.
