Deleting the Document
To delete documents from the index of Apache Solr, we need to specify the ID’s of the documents to be deleted between the <delete></delete> tags.
<delete> <id>003</id> <id>005</id> <id>004</id> <id>002</id> </delete>
Here, this XML code is used to delete the documents with ID’s 003 and 005. Save this code in a file with the name delete.xml.
If you want to delete the documents from the index which belongs to the core named my_core, then you can post the delete.xml file using the post tool, as shown below.
[Hadoop@localhost bin]$ ./post -c my_core delete.xml
On executing the above command, you will get the following output.
/home/Hadoop/java/bin/java -classpath /home/Hadoop/Solr/dist/Solr-core 6.2.0.jar -Dauto = yes -Dc = my_core -Ddata = files org.apache.Solr.util.SimplePostTool delete.xml SimplePostTool version 5.0.0 Posting files to [base] url http://localhost:8983/Solr/my_core/update... Entering auto mode. File endings considered are xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots, rtf,htm,html,txt,log POSTing file delete.xml (application/xml) to [base] 1 files indexed. COMMITting Solr index changes to http://localhost:8983/Solr/my_core/update... Time spent: 0:00:00.179
Verification
Visit the homepage of the of Apache Solr web interface and select the core as my_core. Try to retrieve all the documents by passing the query “:” in the text area q and execute the query. On executing, you can observe that the specified documents are deleted.
Deleting a Field
Sometimes we need to delete documents based on fields other than ID. For example, we may have to delete the documents where the city is Chennai.
In such cases, you need to specify the name and value of the field within the <query></query> tag pair.
<delete> <query>city:Chennai</query> </delete>
Save it as delete_field.xml and perform the delete operation on the core named my_core using the post tool of Solr.
[Hadoop@localhost bin]$ ./post -c my_core delete_field.xml
On executing the above command, it produces the following output.
/home/Hadoop/java/bin/java -classpath /home/Hadoop/Solr/dist/Solr-core 6.2.0.jar -Dauto = yes -Dc = my_core -Ddata = files org.apache.Solr.util.SimplePostTool delete_field.xml SimplePostTool version 5.0.0 Posting files to [base] url http://localhost:8983/Solr/my_core/update... Entering auto mode. File endings considered are xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots, rtf,htm,html,txt,log POSTing file delete_field.xml (application/xml) to [base] 1 files indexed. COMMITting Solr index changes to http://localhost:8983/Solr/my_core/update... Time spent: 0:00:00.084
Verification
Visit the homepage of the of Apache Solr web interface and select the core as my_core. Try to retrieve all the documents by passing the query “:” in the text area q and execute the query. On executing, you can observe that the documents containing the specified field value pair are deleted.
Deleting All Documents
Just like deleting a specific field, if you want to delete all the documents from an index, you just need to pass the symbol “:” between the tags <query></ query>, as shown below.
<delete> <query>*:*</query> </delete>
Save it as delete_all.xml and perform the delete operation on the core named my_core using the post tool of Solr.
[Hadoop@localhost bin]$ ./post -c my_core delete_all.xml
On executing the above command, it produces the following output.
/home/Hadoop/java/bin/java -classpath /home/Hadoop/Solr/dist/Solr-core 6.2.0.jar -Dauto = yes -Dc = my_core -Ddata = files org.apache.Solr.util.SimplePostTool deleteAll.xml SimplePostTool version 5.0.0 Posting files to [base] url http://localhost:8983/Solr/my_core/update... Entering auto mode. File endings considered are xml,json,jsonl,csv,pdf,doc,docx,ppt,pptx,xls,xlsx,odt,odp,ods,ott,otp,ots,rtf, htm,html,txt,log POSTing file deleteAll.xml (application/xml) to [base] 1 files indexed. COMMITting Solr index changes to http://localhost:8983/Solr/my_core/update... Time spent: 0:00:00.138
Verification
Visit the homepage of Apache Solr web interface and select the core as my_core. Try to retrieve all the documents by passing the query “:” in the text area q and execute the query. On executing, you can observe that the documents containing the specified field value pair are deleted.
Deleting all the documents using Java (Client API)
Following is the Java program to add documents to Apache Solr index. Save this code in a file with the name UpdatingDocument.java.
import java.io.IOException; import org.apache.Solr.client.Solrj.SolrClient; import org.apache.Solr.client.Solrj.SolrServerException; import org.apache.Solr.client.Solrj.impl.HttpSolrClient; import org.apache.Solr.common.SolrInputDocument; public class DeletingAllDocuments { public static void main(String args[]) throws SolrServerException, IOException { //Preparing the Solr client String urlString = "http://localhost:8983/Solr/my_core"; SolrClient Solr = new HttpSolrClient.Builder(urlString).build(); //Preparing the Solr document SolrInputDocument doc = new SolrInputDocument(); //Deleting the documents from Solr Solr.deleteByQuery("*"); //Saving the document Solr.commit(); System.out.println("Documents deleted"); } }
Compile the above code by executing the following commands in the terminal −
[Hadoop@localhost bin]$ javac DeletingAllDocuments [Hadoop@localhost bin]$ java DeletingAllDocuments
On executing the above command, you will get the following output.
Documents deleted