
Overview

Apache Solr

Apache Solr is an enterprise search platform written in Java and using Apache Lucene. Major features include full-text search, index replication and sharding, and result faceting and highlighting.


Online Documentation

This README file only contains basic setup instructions. For more comprehensive documentation, visit https://solr.apache.org/guide/

Building with Gradle

First, set up your development environment (OpenJDK 11 or greater).

We'll assume that you know how to get and set up the JDK - if you don't, then we suggest starting at https://jdk.java.net/ and learning more about Java, before returning to this README. Solr runs with Java 11 and later.
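Since Solr requires Java 11 or later, an early programmatic check of the JVM version can save a confusing build failure. This is a standalone sketch, not part of the Solr build; the parseFeature helper is our own illustration of how feature versions map from version strings.

```java
// Minimal sketch: verify the running JVM meets Solr's Java 11 requirement.
public class JavaVersionCheck {

    // Extracts the feature (major) version from a version string such as
    // "11.0.2" or "17". Legacy "1.8.0_292"-style strings map to 8.
    public static int parseFeature(String version) {
        String[] parts = version.split("\\.");
        int first = Integer.parseInt(parts[0].split("_")[0]);
        if (first == 1 && parts.length > 1) {        // legacy "1.x" scheme
            return Integer.parseInt(parts[1]);
        }
        return first;
    }

    public static void main(String[] args) {
        int feature = Runtime.version().feature();   // available since Java 10
        System.out.println("Running on Java " + feature);
        if (feature < 11) {
            throw new IllegalStateException("Solr requires Java 11 or later");
        }
    }
}
```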

As of 9.0, Solr uses Gradle as the build system. Ant build support has been removed.

To build Solr, run (./ can be omitted on Windows):

./gradlew assemble

NOTE: Do not use a gradle command already installed on your machine (unless you know what you are doing). The "gradle wrapper" (gradlew) does the job: it downloads the correct Gradle version and sets up the necessary configuration.

The first time you run Gradle, it will create a file "gradle.properties" that contains machine-specific settings. Normally you can use this file as-is, but it can be modified if necessary.

The command above packages a full distribution of Solr server; the package can be located at:

solr/packaging/build/solr-*

Note that the Gradle build does not create or copy binaries throughout the source repository, so you need to switch to the packaging output folder above; the rest of the instructions below remain identical. The packaging directory is rewritten on each build.

For development, especially when you have created test indexes etc., use the ./gradlew dev task, which copies binaries to ./solr/packaging/build/dev but overwrites only the binaries, preserving your test setup.

If you want to build the documentation, type ./gradlew -p solr documentation.

Running Solr

After building Solr, the server can be started using the bin/solr control scripts. Solr can be run in either standalone or distributed (SolrCloud) mode.

To run Solr in standalone mode, run the following command from the solr/ directory:

bin/solr start

To run Solr in SolrCloud mode, run the following command from the solr/ directory:

bin/solr start -c

The bin/solr control script allows extensive customization of how Solr starts. Common options are described in some detail in solr/README.txt. For an exhaustive treatment of the options, run bin/solr start -h from the solr/ directory.

Gradle build and IDE support

  • IntelliJ - IntelliJ IDEA can import the project out of the box. Code formatting conventions should be manually adjusted.
  • Eclipse - Not tested.
  • Netbeans - Not tested.

Gradle build and tests

./gradlew assemble will build a runnable Solr as noted above.

./gradlew check will assemble Solr and run all validation tasks and unit tests.

./gradlew help will print a list of help commands for high-level tasks. One of these is helpAnt, which shows the Gradle tasks corresponding to the Ant targets you may be familiar with.

Contributing

Please review the Contributing to Solr Guide for information on contributing.

Discussion and Support

Comments
  • SOLR-15089: Allow backup/restoration to Amazon's S3 blobstore

    SOLR-15089: Allow backup/restoration to Amazon's S3 blobstore

    Description

    Solr provides a BackupRepository interface with which users can create backups to arbitrary backends. There is now a GCS implementation (see https://github.com/apache/solr/pull/39), but no S3 impl yet.

    Solution

    This PR adds a BackupRepository implementation for communicating with S3.

    Tests

    We've added new unit tests at the BackupRepository level as well as tests for the S3 interactions (using the S3Mock framework).

    Checklist

    Please review the following and check all that apply:

    • [x] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
    • [x] I have created a Jira issue and added the issue ID to my pull request title.
    • [x] I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
    • [x] I have developed this patch against the main branch.
    • [x] I have run ./gradlew check.
    • [x] I have added tests for my changes.
    • [x] I have added documentation for the Reference Guide

    We have not yet done the work of adding license files for all the newly added libraries/dependencies. These will be added in a future commit.

    opened by athrog 42
  • Dynamically discover lucene version for use in build

    Dynamically discover lucene version for use in build

    The test :solr:validateConfigFileSanity checks that <luceneMatchVersion> is correct in various solrconfig files. Now that the Lucene version differs from the Solr version (e.g. on the main branch right now), the build needs to know which Lucene version we have.

    I tried using Lucene's Version.LATEST.toString(), but the build found an 8_10_0 version in my environment. So now I just pull it from versions.props, and that works.

    However, it may not be compatible with the local Lucene version override feature in lucene-dev-repo-composite.gradle.
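    The "pull it from versions.props" approach above can be sketched in plain Java. This is an illustrative stand-in, not the actual build code; it assumes versions.props lines of the form "group:artifact=version" (the format used by gradle-consistent-versions), and the parsing logic here is our own.

```java
import java.util.Optional;

// Sketch of pulling the Lucene version out of versions.props-style content.
public class LuceneVersionLookup {

    public static Optional<String> findVersion(String propsContent, String groupPrefix) {
        for (String line : propsContent.split("\\R")) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) continue;  // skip comments
            int eq = line.indexOf('=');
            if (eq < 0) continue;
            String coordinate = line.substring(0, eq).trim();
            if (coordinate.startsWith(groupPrefix)) {
                return Optional.of(line.substring(eq + 1).trim());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String props = "# dependency versions\norg.apache.lucene:*=9.1.0\norg.slf4j:*=2.0.3\n";
        System.out.println(findVersion(props, "org.apache.lucene").orElse("unknown"));
    }
}
```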

    opened by janhoy 39
  • SOLR-15955: Update Jetty dependency to 10

    SOLR-15955: Update Jetty dependency to 10

    https://issues.apache.org/jira/browse/SOLR-15955

    Summary:

    • Upgrades to Jetty 10.0.12
    • dropwizard metrics 4.2.12 for dropwizard-metrics9 -> dropwizard-metrics10
    • log4j 2.19.0 and slf4j 2.0.3
    • for s3mock specifically upgrade spring-boot 2.5.14 and spring 5.3.23 to handle Jetty 10
    opened by markrmiller 34
  • SOLR-15824 Improved Query Screen raw query parameters section

    SOLR-15824 Improved Query Screen raw query parameters section

    https://issues.apache.org/jira/browse/SOLR-15824

    Description

    While sending a query in Solr, it is very convenient that the q field is resizable (because it is in a textarea tag), while the fq and raw query parameters fields are not, which can cause difficulties with long query parameters. To solve this, I made improvements to query.html and query.css.

    Solution

    I put fq and raw query parameters inside textarea tags and, with CSS, made them vertically resizable. Because of the line between the query and result panels, resizing both vertically and horizontally (or only horizontally) would cause the appearance of the panel to deteriorate.

    Checklist

    Please review the following and check all that apply:

    • [X] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
    • [X] I have created a Jira issue and added the issue ID to my pull request title.
    • [X] I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
    • [X] I have developed this patch against the main branch.
    • [X] I have run ./gradlew check.
    • [ ] I have added tests for my changes.
    • [ ] I have added documentation for the Reference Guide

    EDIT

    I changed only the raw query parameters section because of the comments on the PR, and added +/- buttons for raw query parameters.

    opened by betulince 27
  • SOLR-16271: remove wildcard imports

    SOLR-16271: remove wildcard imports

    https://issues.apache.org/jira/browse/SOLR-16271

    Description

    Remove wildcard imports from the build. A later issue will add a Spotless-based check.

    Solution

    Made the changes.

    Tests

    Ran the tests manually.

    Checklist

    Please review the following and check all that apply:

    • [ ] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
    • [ ] I have created a Jira issue and added the issue ID to my pull request title.
    • [ ] I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
    • [ ] I have developed this patch against the main branch.
    • [ ] I have run ./gradlew check.
    • [ ] I have added tests for my changes.
    • [ ] I have added documentation for the Reference Guide
    opened by epugh 24
  • SOLR-15982: Add end time value to backup response

    SOLR-15982: Add end time value to backup response

    https://issues.apache.org/jira/browse/SOLR-15982

    Description

    Adds a new field, endTime, to the collection backup response (the response is actually an aggregated result of responses from multiple nodes).

    Solution

    After the backup finishes (regardless of whether it is sync or async), Solr writes a backup.properties file to the repository; this file actually contains the content of org.apache.solr.core.backup.BackupProperties. Before the backup properties are written, the endTime value is filled in at org.apache.solr.core.backup.BackupProperties.store(Writer):

        public void store(Writer propsWriter) throws IOException {
            properties.put("indexSizeMB", String.valueOf(indexSizeMB));
            properties.put("indexFileCount", String.valueOf(indexFileCount));
            properties.put(BackupManager.END_TIME_PROP, Instant.now().toString());
            properties.store(propsWriter, "Backup properties file");
        }
    

    Immediately after the backup properties file is written, an additional field endTime is appended to the response within the org.apache.solr.cloud.api.collections.BackupCmd.call(ClusterState, ZkNodeProps, NamedList<Object>) call:

      public void call(ClusterState state, ZkNodeProps message, NamedList<Object> results) throws Exception {
        ...
        try (BackupRepository repository = cc.newBackupRepository(repo)) {
          ...
          backupMgr.writeBackupProperties(backupProperties);
    
          if(backupProperties != null) {
            NamedList<Object> response = (NamedList<Object>) results.get("response");
            response.add("endTime", backupProperties.getEndTime());
          }
          ...
        }
      }
    

    Tests

    None

    Checklist

    Please review the following and check all that apply:

    • [x] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
    • [x] I have created a Jira issue and added the issue ID to my pull request title.
    • [x] I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
    • [x] I have developed this patch against the main branch.
    • [x] I have run ./gradlew check.
    • [ ] I have added tests for my changes.
    • [x] I have added documentation for the Reference Guide
    opened by ijioio 23
  • SOLR-15342: Separate out a SolrJ-Zookeeper module

    SOLR-15342: Separate out a SolrJ-Zookeeper module

    https://issues.apache.org/jira/browse/SOLR-15342

    I opened this PR as a draft to share the evolution of the work. I closed the old PR since it had been open for a while; there have been quite a few changes related to ZooKeeper since then, and some of the work was contributed in a separate PR. I resume that work here.

    opened by heythm 21
  • SOLR-15842: Fix async backup response

    SOLR-15842: Fix async backup response

    https://issues.apache.org/jira/browse/SOLR-15842

    Description

    Adds a new field to org.apache.solr.handler.admin.CoreAdminHandler.TaskObject to hold operation results.

    Solution

    I applied a simple change that adds an additional field, operationRspInfo, to org.apache.solr.handler.admin.CoreAdminHandler.TaskObject. It is filled with the operation results when the operation finishes successfully (exactly the same results used for a sync request). This way the results are stored within the TaskObject.

    Later, when the request status is fetched, besides the standard Response value, an additional response value is added to the request's response; it holds the operation results preserved in the TaskObject.
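    The mechanism described above can be modeled in a few lines. This is a simplified, hypothetical stand-in, not Solr's actual CoreAdminHandler.TaskObject; the class and key names below are illustrative only.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Simplified model of the idea in this PR: an async task records the
// operation's response so a later status request can return it.
public class AsyncTaskSketch {

    static class TaskObject {
        final String requestId;
        String status = "running";
        Map<String, Object> operationRspInfo;   // filled on success

        TaskObject(String requestId) { this.requestId = requestId; }
    }

    // Builds the status response; the stored operation results are exposed
    // under an extra "response" key, mirroring the PR's description.
    static Map<String, Object> requestStatus(TaskObject task) {
        Map<String, Object> rsp = new LinkedHashMap<>();
        rsp.put("Response", "TaskId: " + task.requestId + " status: " + task.status);
        if (task.operationRspInfo != null) {
            rsp.put("response", task.operationRspInfo);
        }
        return rsp;
    }

    public static void main(String[] args) {
        TaskObject task = new TaskObject("backup-42");
        task.status = "completed";
        task.operationRspInfo = Map.of("endTime", "2022-01-01T00:00:00Z");
        System.out.println(requestStatus(task));
    }
}
```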

    Tests

    None

    Checklist

    Please review the following and check all that apply:

    • [x] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
    • [x] I have created a Jira issue and added the issue ID to my pull request title.
    • [x] I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
    • [x] I have developed this patch against the main branch.
    • [ ] I have run ./gradlew check.
    • [ ] I have added tests for my changes.
    • [ ] I have added documentation for the Reference Guide
    opened by ijioio 20
  • SOLR-10452: setQueryParams should be deprecated in favor of SolrClientBuilder methods

    SOLR-10452: setQueryParams should be deprecated in favor of SolrClientBuilder methods

    https://issues.apache.org/jira/browse/SOLR-10452

    Description

    setQueryParams should be deprecated in favor of SolrClientBuilder methods.

    Solution

    I moved setQueryParams over for Http2SolrClient; it had already been done for HttpSolrClient. I also noticed an addQueryParams method, which I marked deprecated and which we shouldn't use, as it goes against the idea of a SolrClient being immutable.

    One area I tried to fix and gave up on was DelegationTokenHttpSolrClient; I couldn't quite figure out what to do there and would love a suggestion or a fix ;-)
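    The immutability argument behind this deprecation can be illustrated generically: with a builder, default query parameters are fixed at construction, so the client never mutates afterwards. The names below are hypothetical, not the real SolrJ builder API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Generic sketch of the builder pattern for an immutable client:
// configuration is collected by the mutable Builder, then frozen at build().
public class ImmutableClientSketch {

    private final String baseUrl;
    private final Map<String, String> defaultParams;

    private ImmutableClientSketch(Builder b) {
        this.baseUrl = b.baseUrl;
        this.defaultParams = Map.copyOf(b.defaultParams);  // immutable snapshot
    }

    public String getBaseUrl() { return baseUrl; }
    public Map<String, String> getDefaultParams() { return defaultParams; }

    public static class Builder {
        private final String baseUrl;
        private final Map<String, String> defaultParams = new LinkedHashMap<>();

        public Builder(String baseUrl) { this.baseUrl = baseUrl; }

        public Builder withQueryParam(String name, String value) {
            defaultParams.put(name, value);
            return this;                                   // fluent chaining
        }

        public ImmutableClientSketch build() { return new ImmutableClientSketch(this); }
    }

    public static void main(String[] args) {
        ImmutableClientSketch client = new Builder("http://localhost:8983/solr")
                .withQueryParam("wt", "json")
                .build();
        System.out.println(client.getDefaultParams());
    }
}
```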

    Tests

    Please describe the tests you've developed or run to confirm this patch implements the feature or solves the problem.

    Checklist

    Please review the following and check all that apply:

    • [ ] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
    • [ ] I have created a Jira issue and added the issue ID to my pull request title.
    • [ ] I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
    • [ ] I have developed this patch against the main branch.
    • [ ] I have run ./gradlew check.
    • [ ] I have added tests for my changes.
    • [ ] I have added documentation for the Reference Guide
    opened by epugh 19
  • SOLR-16574: Demonstrate Dense Vectors and KNN as part of the Films example

    SOLR-16574: Demonstrate Dense Vectors and KNN as part of the Films example

    https://issues.apache.org/jira/browse/SOLR-16574

    Description

    Enrich the films example to demonstrate how to use the Dense Vectors feature.

    Solution

    Added the field film_vector to the films dataset. This is an embedding vector created to represent each movie with 10 dimensions. The vector combines the first 5 dimensions of a pre-trained BERT sentence model applied to the movie's name plus the names of its genres, followed by a 5-dimensional item2vec model of genre co-occurrence across movies, for 10 dimensions in total. Even though similar movies are expected to be close to each other, this is just a "toy example" model to serve as the source for creating the film vectors.

    The README of the example was also updated to include the specification of the Dense Vector field in the schema, and a new section was created with examples showing how to make KNN queries with the vectors.
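    A KNN query for this example can be assembled as a string using Solr 9's {!knn} query parser syntax. The helper below is our own illustration (the film_vector field name comes from this PR; the toy vector values are made up).

```java
import java.util.StringJoiner;

// Sketch of building a Solr 9 KNN query string for a dense vector field,
// e.g. {!knn f=film_vector topK=10}[0.1, 0.2, ...]
public class KnnQueryBuilder {

    public static String knnQuery(String field, int topK, float[] vector) {
        StringJoiner joiner = new StringJoiner(", ", "[", "]");
        for (float v : vector) {
            joiner.add(Float.toString(v));
        }
        return String.format("{!knn f=%s topK=%d}%s", field, topK, joiner);
    }

    public static void main(String[] args) {
        float[] embedding = {-0.1f, 0.2f, 0.3f};   // toy 3-dimensional vector
        System.out.println(knnQuery("film_vector", 10, embedding));
    }
}
```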

    Tests

    • Added the new field film_vector to the 3 dataset formats (JSON, XML, CSV), making sure to preserve exactly the same data from the original datasets, so that the "diff" is only the addition of the new field.
    • Checked the creation of the collection for the 3 dataset formats. Regardless of the format, all 1100 films were indexed, and the film_vector field was correctly parsed and indexed as well.
    • Checked the KNN example queries for all the 3 dataset formats.

    Checklist

    Please review the following and check all that apply:

    • [X] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
    • [X] I have created a Jira issue and added the issue ID to my pull request title.
    • [X] I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
    • [X] I have developed this patch against the main branch.
    • [ ] I have run ./gradlew check.
    • [ ] I have added tests for my changes.
    • [ ] I have added documentation for the Reference Guide
    opened by gabrielmagno 19
  • SOLR-16368: Use Builder Pattern with Solr Clients

    SOLR-16368: Use Builder Pattern with Solr Clients

    https://issues.apache.org/jira/browse/SOLR-16368

    Description

    Part of working on reducing the use of the legacy HttpSolrClient in tests everywhere is seeing whether mutating the client can be reduced by embracing the Builder pattern.

    Solution

    Use the Builder where possible.

    I couldn't figure out how to untangle the logic in TestRandomFlRTGCloud and would love some eyes on that one!

    Tests

    Ran the tests.

    Checklist

    Please review the following and check all that apply:

    • [ ] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
    • [ ] I have created a Jira issue and added the issue ID to my pull request title.
    • [ ] I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
    • [ ] I have developed this patch against the main branch.
    • [ ] I have run ./gradlew check.
    • [ ] I have added tests for my changes.
    • [ ] I have added documentation for the Reference Guide
    opened by epugh 19
  • SOLR-16613: CryptoKeys should handle RSA padding for OpenJ9

    SOLR-16613: CryptoKeys should handle RSA padding for OpenJ9

    https://issues.apache.org/jira/browse/SOLR-16613

    Tested with:

    # With Temurin 17
    ./gradlew test --tests TestRSAKeyPair
    ./gradlew test --tests TestPKIAuthenticationPlugin
    
    # With OpenJ9 17
    RUNTIME_JAVA_HOME=/Library/Java/JavaVirtualMachines/ibm-semeru-open-17.jdk/Contents/Home ./gradlew test --tests TestRSAKeyPair
    RUNTIME_JAVA_HOME=/Library/Java/JavaVirtualMachines/ibm-semeru-open-17.jdk/Contents/Home ./gradlew test --tests TestPKIAuthenticationPlugin
    
    opened by risdenk 1
  • SOLR-16532 Further improvements to opentelemetry module

    SOLR-16532 Further improvements to opentelemetry module

    https://issues.apache.org/jira/browse/SOLR-16532

    I created a new PR where we can gather all follow-up cleanups that may arrive after the initial merge in #1168.

    opened by janhoy 1
  • SOLR-16610: Support Copy n Paste of Command Line commands in Ref Guide

    SOLR-16610: Support Copy n Paste of Command Line commands in Ref Guide

    https://issues.apache.org/jira/browse/SOLR-16610

    Description

    This is an example of the types of changes we would need to make to be more in line with Antora's handling of command-line examples, so that they can be copied and pasted.

    Solution

    Conversion

    Tests

    manual

    opened by epugh 5
  • SOLR-16608: Ability to compress the collection state

    SOLR-16608: Ability to compress the collection state

    https://issues.apache.org/jira/browse/SOLR-16608

    Description

    This PR provides the ability to configure a minimum size of state.json above which it will be compressed when written to ZooKeeper. Solr can handle compressing/decompressing in ZLib format in all areas where it reads state.json from ZooKeeper.

    Solution

    This uses ZLib compression to optionally compress state.json. The core of the logic is contained in a new class, CompressionUtil, which handles both compression and decompression, along with a very efficient check of whether bytes are compressed. It can read and write both compressed and uncompressed data, so changing the compression configuration is backward compatible and will not break reading state.json.
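    The round-trip this PR relies on can be sketched with java.util.zip alone: compress the state.json bytes, cheaply detect whether stored bytes are compressed, and decompress on read. This is our own illustration; the real CompressionUtil may differ in detail.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

// Sketch of ZLib compression/decompression for state.json-style bytes.
public class ZlibStateSketch {

    public static byte[] compress(byte[] data) {
        Deflater deflater = new Deflater();
        deflater.setInput(data);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[1024];
        while (!deflater.finished()) {
            out.write(buf, 0, deflater.deflate(buf));
        }
        deflater.end();
        return out.toByteArray();
    }

    // ZLib streams start with 0x78; checking the first byte is the kind of
    // cheap test the PR describes for telling compressed data from plain JSON
    // (which starts with '{', 0x7B).
    public static boolean looksCompressed(byte[] data) {
        return data.length > 0 && (data[0] & 0xFF) == 0x78;
    }

    public static byte[] decompress(byte[] data) {
        try {
            Inflater inflater = new Inflater();
            inflater.setInput(data);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[1024];
            while (!inflater.finished()) {
                out.write(buf, 0, inflater.inflate(buf));
            }
            inflater.end();
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new RuntimeException("not a valid ZLib stream", e);
        }
    }

    public static void main(String[] args) {
        byte[] state = "{\"collection\":{\"shards\":{}}}".getBytes(StandardCharsets.UTF_8);
        byte[] packed = compress(state);
        System.out.println("compressed=" + looksCompressed(packed)
                + " roundtrip=" + new String(decompress(packed), StandardCharsets.UTF_8));
    }
}
```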

    Tests

    Tests were added at various layers to ensure compression/decompression works, including unit tests for CompressionUtil and tests for ZkStateReader and ZkStateWriter.

    Checklist

    Please review the following and check all that apply:

    • [x] I have reviewed the guidelines for How to Contribute and my code conforms to the standards described there to the best of my ability.
    • [x] I have created a Jira issue and added the issue ID to my pull request title.
    • [ ] I have given Solr maintainers access to contribute to my PR branch. (optional but recommended)
    • [x] I have developed this patch against the main branch.
    • [x] I have run ./gradlew check.
    • [x] I have added tests for my changes.
    • [x] I have added documentation for the Reference Guide
    opened by justinrsweeney 4
  • SOLR-788: transfer mlt query for component right

    SOLR-788: transfer mlt query for component right

    https://issues.apache.org/jira/browse/SOLR-788 I think this is a somewhat reliable solution to the problem with the distributed MLT component. As a reminder, it currently relies naively on parsing booleanQuery.toString(), which is unacceptable.

    opened by mkhludnev 0
Owner
The Apache Software Foundation