How to configure Solr and Tika on Tomcat Server



Solr Configuration Document

Note: The stable release for running Solr on Tomcat is apache-solr-1.4.1.war, but it lacks some files, such as TikaEntityProcessor.class, that are required to configure Tika on Tomcat. To run Solr and Tika together on Tomcat, we therefore used the Solr development version, apache-solr-3.2.

*apache-solr-3.0.war is the working version which was used in the FFC Project.

How to Configure Solr on Tomcat Server


A)  Follow the link below to download Apache-Solr 1.4.1 (stable version):


B)  Follow the link below to download Apache-Solr 3.2 (development version):

Step 1: Install Tomcat.

  •  On Ubuntu:

sudo apt-get install tomcat6

  •  On CentOS:

sudo yum install -y tomcat5

Step 2:  Create /var/solr :

mkdir -p /var/solr

Step 3: Copy the Example Configuration

For apache-solr-1.4.1:

cp -R apache-solr-1.4.1/example/solr/* /var/solr

For apache-solr-3.2:

cp -R apache-solr-3.2-*/example/solr/* /var/solr

Note: apache-solr-3.2 is not a stable release. Its files and directories are named in the form apache-solr-3.2-date_time (for example, apache-solr-3.2-2011-03-13_05-27-19). For convenience, you can remove the date and time from these names before using them.

Step 4: Copy the .war:

For apache-solr-1.4.1:

cp apache-solr-1.4.1/dist/apache-solr-1.4.1.war /var/solr/solr.war

For apache-solr-3.2:

cp apache-solr-3.2-*/dist/apache-solr-3.2-*.war /var/solr/solr.war

Step 5: Set Permissions

On Ubuntu:

chown -R tomcat6 /var/solr/

On CentOS:

chown -R tomcat /var/solr/
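Steps 2 to 5 can also be collected into one small shell function. This is only a sketch: the function name stage_solr_home and its three arguments are made up for illustration, and it assumes the distribution directory contains exactly the example/solr and dist layout used in the commands above.

```shell
#!/bin/sh
# Hypothetical helper combining Steps 2-5.
#   $1 = unpacked Solr distribution directory (e.g. apache-solr-1.4.1)
#   $2 = Solr home to create (this article uses /var/solr)
#   $3 = optional owner (tomcat6 on Ubuntu, tomcat on CentOS)
stage_solr_home() {
    dist_dir=$1
    solr_home=$2
    owner=${3:-}

    # Step 2: create the Solr home directory.
    mkdir -p "$solr_home" || return 1

    # Step 3: copy the example configuration.
    cp -R "$dist_dir"/example/solr/* "$solr_home"/ || return 1

    # Step 4: copy the first .war found under dist/ as solr.war.
    war_file=$(ls "$dist_dir"/dist/*.war 2>/dev/null | head -n 1)
    if [ -z "$war_file" ]; then
        echo "no .war file found in $dist_dir/dist" >&2
        return 1
    fi
    cp "$war_file" "$solr_home/solr.war" || return 1

    # Step 5: hand ownership to the Tomcat user, if one was given.
    if [ -n "$owner" ]; then
        chown -R "$owner" "$solr_home" || return 1
    fi
}
```

On Ubuntu it would be called as root, for example: stage_solr_home apache-solr-1.4.1 /var/solr tomcat6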

Step 6: Copy the following into the Tomcat context configuration.

On Ubuntu:

<Context docBase="/var/solr/solr.war" debug="0" privileged="true" allowLinking="true" crossContext="true">
<Environment name="solr/home" type="java.lang.String" value="/var/solr" override="true" />
</Context>

On CentOS:

<Context docBase="/var/solr/solr.war" debug="0" privileged="true" allowLinking="true" crossContext="true">
<Environment name="solr/home" type="java.lang.String" value="/var/solr" override="true" />
</Context>


Step 7: On CentOS, remove some compatibility classes:

cd /usr/share/tomcat5/common/endorsed/

rm *

Step 8: To change the port number of the Tomcat server, edit server.xml.

On Ubuntu, this is in /etc/tomcat6/conf/server.xml.

On CentOS, this is in /etc/tomcat5/conf/server.xml.

Search for the comment line or tag below and change the port number to 8983, or to any other port you want.

<!-- Define a non-SSL HTTP/1.1 Connector on port 8080 ... -->

<Connector port="8080" ... />
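If you prefer not to edit server.xml by hand, the port change can be scripted with sed. The sketch below assumes the port attribute comes immediately after the Connector tag name, as in the line above; set_connector_port is a hypothetical helper name, not part of Tomcat.

```shell
#!/bin/sh
# Hypothetical helper: set_connector_port FILE OLD_PORT NEW_PORT
set_connector_port() {
    file=$1
    old_port=$2
    new_port=$3

    # Keep a backup so the change can be undone.
    cp "$file" "$file.bak" || return 1

    # Rewrite only Connector tags whose port attribute equals the old port.
    sed "s/<Connector port=\"$old_port\"/<Connector port=\"$new_port\"/" \
        "$file.bak" > "$file"
}
```

For example, on Ubuntu: set_connector_port /etc/tomcat6/conf/server.xml 8080 8983. Other connectors, such as one on a different port, are left untouched.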

Step 9: On CentOS, set the data directory:



Step 10: Restart the Tomcat server.

Step 11: Open any browser and go to http://localhost:8080/solr (if you changed the port in Step 8, use that port instead of 8080).

Note: The basic configuration is complete after Step 10.
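The browser check in Step 11 can also be done from the command line. This is a rough sketch that assumes curl is installed; solr_url and check_solr are hypothetical helper names, and the host and port defaults simply mirror the URL above.

```shell
#!/bin/sh
# Build the Solr URL; host and port default to the values used in Step 11.
solr_url() {
    printf 'http://%s:%s/solr\n' "${1:-localhost}" "${2:-8080}"
}

# Print the HTTP status code returned for the Solr URL.
check_solr() {
    curl -s -o /dev/null -w '%{http_code}\n' "$(solr_url "$@")"
}
```

Running check_solr localhost 8983 prints the status code that Tomcat returns for that address, which is a quick way to see whether the web application was deployed.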

Note: Follow this URL if you need more detail:


Configuring the Database Structure in Solr

Open the schema.xml file and define fields that match your database structure. The sample below is based on the FFC database configuration.

<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="itemid" type="int" indexed="true" stored="true" required="true"/>
<field name="itemtype" type="string" indexed="true" stored="true" required="true"/>
<field name="fullName" type="string" indexed="true" stored="true" required="true"/>
<field name="otherName" type="string" indexed="true" stored="true"/>
<field name="interests" type="text" indexed="true" stored="false"/>
<field name="training" type="text" indexed="true" stored="false"/>
<field name="organization" type="string" indexed="true" stored="false"/>
<field name="specialities" type="text" indexed="true" stored="false"/>
<field name="address1" type="text" indexed="true" stored="false"/>
<field name="city" type="string" indexed="true" stored="false"/>
<field name="state" type="string" indexed="true" stored="false"/>
<field name="country" type="string" indexed="true" stored="false"/>
<field name="zipcode" type="string" indexed="true" stored="false"/>
<field name="entitytype" type="string" indexed="true" stored="false"/>
<field name="industrytype" type="string" indexed="true" stored="false"/>
<field name="policy" type="string" indexed="true" stored="false"/>
<field name="missionstatement" type="string" indexed="true" stored="false"/>
<field name="description" type="text" indexed="true" stored="false" multiValued="true"/>
<field name="categoryId" type="string" indexed="true" stored="false"/>
<field name="chapternames" type="string" indexed="true" stored="false"/>
<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
<field name="createdDate" type="date" indexed="true" required="true" stored="true"/>
<field name="products" type="string" indexed="true" stored="false" multiValued="true"/>
<field name="photos" type="string" indexed="true" stored="false" multiValued="true"/>

<solrQueryParser defaultOperator="OR"/>

<copyField source="fullName" dest="text"/>
<copyField source="otherName" dest="text"/>
<copyField source="description" dest="text"/>
<copyField source="products" dest="text"/>
<copyField source="photos" dest="text"/>
<copyField source="interests" dest="text"/>
<copyField source="training" dest="text"/>
<copyField source="specialities" dest="text"/>
<copyField source="policy" dest="text"/>
<copyField source="missionstatement" dest="text"/>
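A copyField is only useful when its source is also declared as a field, so a quick consistency check on schema.xml can save a restart. The sketch below is a plain text scan, not a real XML parser: it assumes one element per line, straight double quotes, and that name and source are the first attributes, as in the sample above. It does not handle wildcard sources. The function name list_missing_copy_sources is made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: print every copyField source that has no matching
# <field name="..."> declaration in the given schema file.
list_missing_copy_sources() {
    schema=$1

    # Names declared with <field name="...">.
    fields=$(sed -n 's/.*<field name="\([^"]*\)".*/\1/p' "$schema")

    # Sources referenced by <copyField source="...">.
    sed -n 's/.*<copyField source="\([^"]*\)".*/\1/p' "$schema" |
    while read -r src; do
        # -x matches the whole line, -F treats the name as a literal string.
        echo "$fields" | grep -qxF -- "$src" || echo "$src"
    done
}
```

For example: list_missing_copy_sources /var/solr/conf/schema.xml. No output means every copyField source has a field declaration.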


Step 12: Create a file with 777 permissions in /solr/conf/.


Configuring Tika on Solr

To configure Tika on Solr, follow the steps below.

Step 1: Create a lib directory in /var/solr/.

Step 2: Copy all the jar files from /apache-solr-3.2/contrib/extraction/lib, and also from /apache-solr-3.2/dist, into this lib directory. Two of these jar files, apache-solr-dataimporthandler-extras-3.2.jar and apache-solr-dataimporthandler-3.2.jar, are essential for configuring Tika on Solr.

Note: Don't forget to place the mysql-connector-java-5.0.8-bin jar file in the lib directory; it is the MySQL JDBC driver.

The apache-solr-dataimporthandler-extras-3.2.jar file contains TikaEntityProcessor.class.
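Steps 1 and 2 can be scripted roughly as follows. This is a sketch under a few assumptions: install_tika_jars is a hypothetical helper name, and from dist it copies only the two dataimporthandler jars named above rather than every file there. The MySQL driver jar still has to be copied separately.

```shell
#!/bin/sh
# Hypothetical helper: install_tika_jars DIST_DIR LIB_DIR
#   DIST_DIR = unpacked Solr distribution (e.g. /apache-solr-3.2)
#   LIB_DIR  = target lib directory (this article uses /var/solr/lib)
install_tika_jars() {
    dist_dir=$1
    lib_dir=$2

    # Step 1: create the lib directory.
    mkdir -p "$lib_dir" || return 1

    # Step 2: copy the extraction (Tika) jars.
    cp "$dist_dir"/contrib/extraction/lib/*.jar "$lib_dir"/ || return 1

    # The glob matches both the dataimporthandler and the
    # dataimporthandler-extras jar.
    cp "$dist_dir"/dist/apache-solr-dataimporthandler-*.jar "$lib_dir"/ || return 1
}
```

For example: install_tika_jars /apache-solr-3.2 /var/solr/lib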

Create a data-config.xml file in /var/solr and put the Tika configuration details in it.

<dataConfig>

<dataSource name="ds-db" driver="com.mysql.jdbc.Driver" url="jdbc:mysql://" batchSize="-1" user="ffcdemo" password="aGaqxC4jSSBrLjKn" readOnly="true" encoding="UTF-8"/>

<dataSource type="BinFileDataSource" name="bin" />

<document name="products">

<entity dataSource="ds-db" name="item"
  query="select group_id,group_title,description,DATE_FORMAT(created_date, '%Y-%m-%dT%H:%i:%sZ') as createdDate,group_status, 'GROUP' as itemtype,CONCAT('GROUP',CAST(group_id AS CHAR CHARACTER SET utf8 )) as id from collaboration_groups where group_status=1"
  deltaImportQuery="select group_id,group_title,description,DATE_FORMAT(created_date, '%Y-%m-%dT%H:%i:%sZ') as createdDate,group_status,CONCAT('GROUP',CAST(group_id AS CHAR CHARACTER SET utf8 )) as id,'GROUP' as itemtype from collaboration_groups where group_status=1 and group_id='${}'"
  deltaQuery="select group_id from collaboration_groups where group_status=1 and updated_date &gt; '${dataimporter.last_index_time}'"
  deletedPkQuery="select CONCAT('GROUP',CAST(group_id AS CHAR CHARACTER SET utf8 )) as id from collaboration_groups where group_status = 0 and updated_date &gt; '${dataimporter.last_index_time}'">

<field column="id" name="id" />
<field column="group_id" name="itemid" />
<field column="itemtype" name="itemtype" />
<field column="group_title" name="fullName" />
<field column="description" name="description"/>
<field column="createdDate" name="createdDate"/>

<entity name="tika-test" processor="TikaEntityProcessor" url="" format="text" dataSource="bin">
<field column="text" />
</entity>

</entity>

</document>

</dataConfig>



