Solr V2 API – Quick Look

We are all used to the Solr API that has been present in Solr from its beginnings. We send the data over HTTP, we include all parameters in the URL itself, and we are bound to that. Some people loved this, some not so much. Starting with Solr 6.5 we now have a new, self-documenting API called v2. Let’s look at this new API, how to use it, and how it differs from the old-fashioned Solr API.



Introducing the New Solr API

Let’s just immediately start working with the new API.  It’s probably the best way to learn about it.  Here’s the most basic request we can execute against the new Solr API:

$ curl http://localhost:8983/v2

The first thing you’ll notice is that the new API is not available under the usual Solr context – there is no /solr in the URL. Instead, we talk to it using the /v2 URI path. This lets Solr serve two separate sets of APIs from the same instance and leaves room for new APIs introduced in the future. The response to the above call looks as follows:

{"responseHeader":{"status":0,"QTime":0},"collections":["gettingstarted"]} 

As we can see, the new API returns the same old standard response header and the list of collections that are present in the cluster. The call to the old API to get this same info looks like this:

$ curl 'http://localhost:8983/solr/admin/collections?action=LIST'

This time the response is returned as XML, but the information is the same:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">0</int></lst><arr name="collections"><str>gettingstarted</str></arr>
</response>

Of course, in both cases we can pretty-print the results by adding indent=true to the request, like this:

$ curl 'http://localhost:8983/v2?indent=true'
{
  "responseHeader":{
    "status":0,
    "QTime":0},
  "collections":["gettingstarted"]}

We can also change the response type when using the old API, so that the returned response is very similar:

$ curl 'http://localhost:8983/solr/admin/collections?action=LIST&wt=json&indent=true'
 {
   "responseHeader":{
     "status":0,
     "QTime":0},
   "collections":["gettingstarted"]}
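The two calls compared above differ mainly in where they live and in how the action is expressed. As a minimal sketch, the following script builds both URLs so they can be inspected side by side. The function names and the SOLR_BASE variable are made up for illustration; only the URLs themselves come from the examples above.

```shell
#!/usr/bin/env bash
# Hypothetical helper names; only the URLs come from the article.
SOLR_BASE="http://localhost:8983"

# v2 endpoint: lives under /v2, outside the usual /solr context.
v2_list_collections_url() {
  printf '%s/v2?indent=true\n' "$SOLR_BASE"
}

# Old endpoint: action and response format are passed as URL parameters.
old_list_collections_url() {
  printf '%s/solr/admin/collections?action=LIST&wt=json&indent=true\n' "$SOLR_BASE"
}

# Print both URLs; pass either one to curl against a running node, e.g.
#   curl -s "$(v2_list_collections_url)"
v2_list_collections_url
old_list_collections_url
```

Keeping the base address in one variable makes it easy to point the same calls at a different node.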
 

So, why is that different?

First things first – the new API is self-documenting. That means we can ask the API itself which operations and options are available. By appending the _introspect endpoint to any v2 API call we get the list of operations possible at that path. For example:

$ curl 'http://localhost:8983/v2/collections/_introspect?indent=true'

Or even better, we can use c instead of collections to shorten the call:

$ curl 'http://localhost:8983/v2/c/_introspect?indent=true'

The response returned by Solr is rather large, so we’ll just show a portion of that, but you can see that the API contains not only the response with the data we are looking for, but also some additional descriptions which make the API self-documenting:

{
  "responseHeader":{
    "status":0,
    "QTime":2},
  "spec":[{
      "documentation":"https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api6",
      "description":"Deletes a collection.",
      "methods":["DELETE"],
      "url":{"paths":["/collections/{collection}",
          "/c/{collection}"]}},
    {
      "documentation":"https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api1",
      "description":"Create collections and collection aliases, backup or restore collections, and delete collections and aliases.",
      "methods":["POST"],
      "url":{"paths":["/collections",
          "/c"]},
.
.
.

As you can already tell, the new v2 API is more modern, and most of the parameters are sent in the request body instead of the URI. Once the v2 API covers all the functionality of the old API, SolrJ and the Solr admin will start using it, and after that the old API is expected to be deprecated and then removed. Because of that, it might be a good idea to start getting used to the new API right away, so you have an easier learning curve and faster adoption when you finally decide to move to the new way of talking to Solr.

V2 Solr API Capabilities

The response returned by the commands that we’ve seen above is large, so I encourage you to check the response yourself. What I would like to do is give you a brief overview of what can be done using the v2 API:

  • Creating, deleting and managing collections
  • Creating aliases, backing up and restoring collections
  • Sending data
  • Updating collection configuration
  • Managing schema and managed resources
  • Using request handlers – for example running search requests
  • Adding and removing replicas
  • Managing cores
  • Performing overseer operations
  • Managing node roles
  • Setting cluster properties
  • Uploading and downloading blobs and metadata

As you can see, we can already do lots of things with the new API, and because the API is self-documenting we can quickly see how to work with it without searching for the documentation. For example, if we wanted to see what we can do with shards, we could run a command like this (we’ll use gettingstarted, one of the out-of-the-box collections that come with Solr):

$ curl 'localhost:8983/v2/c/gettingstarted/shards/_introspect?indent=true'

The response shows us what we can do with the “/shards” API:

{
  "spec":[{
      "documentation":"https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api7",
      "description":"Deletes a shard by unloading all replicas of the shard, removing it from clusterstate.json, and by default deleting the instanceDir and dataDir. Only inactive shards or those which have no range for custom sharding will be deleted.",
      "methods":["DELETE"],
      "url":{
        "paths":["/collections/{collection}/shards/{shard}",
          "/c/{collection}/shards/{shard}"],
        "params":{
          "deleteInstanceDir":{
            "type":"boolean",
            "description":"By default Solr will delete the entire instanceDir of each replica that is deleted. Set this to false to prevent the instance directory from being deleted."},
          "deleteDataDir":{
            "type":"boolean",
            "description":"y default Solr will delete the dataDir of each replica that is deleted. Set this to false to prevent the data directory from being deleted."},
          "async":{
            "type":"string",
            "description":"Defines a request ID that can be used to track this action after it's submitted. The action will be processed asynchronously when this is defined. This command can be long-running, so running it asynchronously is recommended."}}}},
    {
      "documentation":"https://cwiki.apache.org/confluence/display/solr/Collections+API",
      "description":"Allows you to create a shard, split an existing shard or add a new replica.",
      "methods":["POST"],
      "url":{"paths":["/collections/{collection}/shards",
          "/c/{collection}/shards"]},
      "commands":{
        "split":{
          "type":"object",
          "documentation":"https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api3",
          "description":"Splits an existing shard into two or more new shards. During this action, the existing shard will continue to contain the original data, but new data will be routed to the new shards once the split is complete. New shards will have as many replicas as the existing shards. A soft commit will be done automatically. An explicit commit request is not required because the index is automatically saved to disk during the split operation. New shards will use the original shard name as the basis for their names, adding an underscore and a number to differentiate the new shard. For example, 'shard1' would become 'shard1_0' and 'shard1_1'. Note that this operation can take a long time to complete.",
          "properties":{
            "shard":{
              "type":"string",
              "description":"The name of the shard to be split."},
            "ranges":{
              "description":"A comma-separated list of hexadecimal hash ranges that will be used to split the shard into new shards containing each defined range, e.g. ranges=0-1f4,1f5-3e8,3e9-5dc. This is the only option that allows splitting a single shard into more than 2 additional shards. If neither this parameter nor splitKey are defined, the shard will be split into two equal new shards.",
              "type":"string"},
            "splitKey":{
              "description":"A route key to use for splitting the index. If this is defined, the shard parameter is not required because the route key will identify the correct shard. A route key that spans more than a single shard is not supported. If neither this parameter nor ranges are defined, the shard will be split into two equal new shards.",
              "type":"string"},
            "coreProperties":{
              "type":"object",
              "documentation":"https://cwiki.apache.org/confluence/display/solr/Defining+core.properties",
              "description":"Allows adding core.properties for the collection. Some examples of core properties you may want to modify include the config set, the node name, the data directory, among others.",
              "additionalProperties":true},
            "async":{
              "type":"string",
              "description":"Defines a request ID that can be used to track this action after it's submitted. The action will be processed asynchronously when this is defined. This command can be long-running, so running it asynchronously is recommended."}}},
        "create":{
          "type":"object",
          "properties":{
            "nodeSet":{
              "description":"Defines nodes to spread the new collection across. If not provided, the collection will be spread across all live Solr nodes. The names to use are the 'node_name', which can be found by a request to the cluster/nodes endpoint.",
              "type":"array",
              "items":{"type":"string"}},
            "shard":{
              "description":"The name of the shard to be created.",
              "type":"string"},
            "coreProperties":{
              "type":"object",
              "documentation":"https://cwiki.apache.org/confluence/display/solr/Defining+core.properties",
              "description":"Allows adding core.properties for the collection. Some examples of core properties you may want to modify include the config set, the node name, the data directory, among others.",
              "additionalProperties":true},
            "async":{
              "type":"string",
              "description":"Defines a request ID that can be used to track this action after it's submitted. The action will be processed asynchronously when this is defined."}},
          "required":["shard"]},
        "add-replica":{
          "documentation":"https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api_addreplica",
          "description":"",
          "type":"object",
          "properties":{
            "shard":{
              "type":"string",
              "description":"The name of the shard in which this replica should be created. If this parameter is not specified, then '_route_' must be defined."},
            "_route_":{
              "type":"string",
              "description":"If the exact shard name is not known, users may pass the _route_ value and the system would identify the name of the shard. Ignored if the shard param is also specified. If the 'shard' parameter is also defined, this parameter will be ignored."},
            "node":{
              "type":"string",
              "description":"The name of the node where the replica should be created."},
            "instanceDir":{
              "type":"string",
              "description":"An optional custom instanceDir for this replica."},
            "dataDir":{
              "type":"string",
              "description":"An optional custom directory used to store index data for this replica."},
            "coreProperties":{
              "type":"object",
              "documentation":"https://cwiki.apache.org/confluence/display/solr/Defining+core.properties",
              "description":"Allows adding core.properties for the collection. Some examples of core properties you may want to modify include the config set and the node name, among others.",
              "additionalProperties":true},
            "async":{
              "type":"string",
              "description":"Defines a request ID that can be used to track this action after it's submitted. The action will be processed asynchronously when this is defined."}},
          "required":["shard"]}}},
    {
      "documentation":"https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api1",
      "description":"Lists all collections, with details on shards and replicas in each collection.",
      "methods":["GET"],
      "url":{"paths":["/collections/{collection}",
          "/c/{collection}",
          "/collections/{collection}/shards",
          "/c/{collection}/shards",
          "/collections/{collection}/shards/{shard}",
          "/c/{collection}/shards/{shard}",
          "/collections/{collection}/shards/{shard}/{replica}",
          "/c/{collection}/shards/{shard}/{replica}"]}}],
  "WARNING":"This response format is experimental.  It is likely to change in the future.",
  "WARNING":"This response format is experimental.  It is likely to change in the future.",
  "WARNING":"This response format is experimental.  It is likely to change in the future.",
  "availableSubPaths":{
    "/c/gettingstarted/shards/{shard}/{replica}":["DELETE",
      "GET"],
    "/c/gettingstarted/shards/{shard}":["DELETE",
      "POST",
      "GET"]}}

As you can see, the API gives us all the information about itself that we need – the HTTP verbs that we can use, the parameters that can be present, and their descriptions, so that we know what each parameter is for. We can also narrow the output to a given command and/or HTTP verb, for example:

$ curl 'http://localhost:8983/v2/c/gettingstarted/shards/shard2/_introspect?method=DELETE&indent=true'
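A small helper can assemble such introspection URLs for any path. This is only a sketch: the function name introspect_url and the SOLR_BASE variable are made up for illustration, while the _introspect endpoint and the method parameter are the ones used in the calls above.

```shell
#!/usr/bin/env bash
# Hypothetical helper; SOLR_BASE and introspect_url are illustrative names.
SOLR_BASE="http://localhost:8983"

# Build an _introspect URL for a v2 path such as "c/gettingstarted/shards".
# The optional second argument narrows the output to one HTTP method.
introspect_url() {
  local path="${1#/}"      # drop one leading slash, if present
  path="${path%/}"         # drop one trailing slash, if present
  local method="${2:-}"
  local url="$SOLR_BASE/v2/$path/_introspect?indent=true"
  if [ -n "$method" ]; then
    url="$url&method=$method"
  fi
  printf '%s\n' "$url"
}

# Print two example URLs; feed them to curl against a running node, e.g.
#   curl -s "$(introspect_url c/gettingstarted/shards)"
introspect_url "c/gettingstarted/shards"
introspect_url "/c/gettingstarted/shards/shard2" DELETE
```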

Judging from the response further above we could, for example, delete a replica by running the following command:

$ curl -XDELETE 'localhost:8983/v2/c/gettingstarted/shards/shard2/core_node3'

The response to the last command would look as follows:

{"responseHeader":{"status":0,"QTime":278},"success":{"192.168.1.15:7574_solr":{"responseHeader":{"status":0,"QTime":69}}}}

This means that the replica of shard2 has been removed, which can also be verified in the Solr admin panel:

[Image: Solr V2 - Solr admin panel]
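Because a DELETE like the one above is destructive, it can be worth wrapping it in a helper that prints the command before anything is executed. The sketch below assumes the replica path layout reported by the introspection output (/c/{collection}/shards/{shard}/{replica}). The function names and the SOLR_BASE and DRY_RUN variables are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names and the DRY_RUN switch are illustrative.
SOLR_BASE="http://localhost:8983"

# Build the v2 URL that addresses a single replica of a shard.
replica_url() {
  printf '%s/v2/c/%s/shards/%s/%s\n' "$SOLR_BASE" "$1" "$2" "$3"
}

# Delete a replica. With DRY_RUN=1 (the default here) the curl command
# is only printed, so nothing is removed by accident.
delete_replica() {
  local url
  url="$(replica_url "$1" "$2" "$3")"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    printf 'curl -XDELETE %s\n' "$url"
  else
    curl -s -XDELETE "$url"
  fi
}

delete_replica gettingstarted shard2 core_node3
```

Setting DRY_RUN=0 before calling delete_replica would send the request to the node for real.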

We can also add replicas using the new API, and this operation is a good illustration of how to pass parameters with the request. Let’s add a replica to shard2 by using the following command:

$ curl -XPOST 'localhost:8983/v2/c/gettingstarted/shards/' -H 'Content-type:application/json' -d '{
 "add-replica" : {
  "shard" : "shard2",
  "node" : "192.168.1.15:7574_solr"
 }
}'

We added a header identifying the content type of the body and provided the add-replica command along with two parameters – shard and node. The shard parameter specifies which shard of the collection should receive the new replica, and the node parameter tells Solr on which instance the replica should be created. Please note that the node name is not just the IP address; it also includes the port and the usual _solr suffix.

The response would look as follows:

{"responseHeader":{"status":0,"QTime":1329},"success":{"192.168.1.15:7574_solr":{"responseHeader":{"status":0,"QTime":1318},"core":"gettingstarted_shard2_replica2"}}}

And would result in a new replica being added:

[Image: Solr V2]
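The add-replica request shows the main stylistic difference between the two APIs: in v2 the command and its parameters travel in a JSON body, whereas the old Collections API expresses the same operation as URL parameters (action=ADDREPLICA). The sketch below builds both forms for comparison. The function names and SOLR_BASE are made up for illustration, and the JSON builder assumes the shard and node values contain no characters that need JSON escaping.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; names are illustrative, not part of Solr.
SOLR_BASE="http://localhost:8983"

# v2 style: the command and its parameters form a JSON request body.
# Assumes shard and node contain no characters that need JSON escaping.
add_replica_body() {
  printf '{"add-replica":{"shard":"%s","node":"%s"}}\n' "$1" "$2"
}

# Old style: the same operation expressed purely as URL parameters.
old_add_replica_url() {
  printf '%s/solr/admin/collections?action=ADDREPLICA&collection=%s&shard=%s&node=%s\n' \
    "$SOLR_BASE" "$1" "$2" "$3"
}

add_replica_body shard2 192.168.1.15:7574_solr
old_add_replica_url gettingstarted shard2 192.168.1.15:7574_solr

# Against a running cluster the body would be posted like this:
#   curl -XPOST "$SOLR_BASE/v2/c/gettingstarted/shards" \
#     -H 'Content-type:application/json' \
#     -d "$(add_replica_body shard2 192.168.1.15:7574_solr)"
```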

What’s Next

The API we just introduced is still a work in progress. A few things are still missing, but the v2 API is fairly new, so we can expect lots of changes in the next few Solr versions.

Want to learn more about Solr? Subscribe to our blog or follow @sematext. If you need any help with Solr / SolrCloud – don’t forget that we provide Solr Consulting, Solr Production Support, and offer Solr Training!

More Stories By Sematext Blog

Sematext is a globally distributed organization that builds innovative Cloud and On Premises solutions for performance monitoring, alerting and anomaly detection (SPM), log management and analytics (Logsene), and search analytics (SSA). We also provide Search and Big Data consulting services and offer 24/7 production support for Solr and Elasticsearch.
