
Elasticsearch Explained: Trying to create too many scroll contexts. Must be less than or equal to 500

Hello everyone, today we are going to discuss the following error in Elasticsearch:

"Trying to create too many scroll contexts. Must be less than or equal to: [500]. This limit can be set by changing the [search.max_open_scroll_context] setting"

Let's try to understand why this occurs and how we can solve it.





When and why is this error triggered?


As the title indicates, this error occurs when you use the scroll API, especially when many scrolls are open at the same time.

Scrolls are expensive to run concurrently because each one reserves resources for as long as it stays open.

For each scroll ID, there is a unique point-in-time view of the current set of segments preserved for that scroll. This hangs on to files and related caches that would otherwise be removed by the constant segment rewriting that happens while indexing is active. This is why it is especially resource-intensive to do concurrently.

Let's dive a little deeper.

In order to use scrolling, the initial search request should specify the scroll parameter in the query string, which tells Elasticsearch how long it should keep the “search context” alive. Its value (e.g. 1m) does not need to be long enough to process all data — it just needs to be long enough to process the previous batch of results. Each scroll request (with the scroll parameter) sets a new expiry time. If a scroll request doesn’t pass in the scroll parameter, then the search context will be freed as part of that scroll request.



POST /twitter/_search?scroll=1m
{
    "size": 100,
    "query": {
        "match": {
            "title": "elasticsearch"
        }
    }
}
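The follow-up requests can be sketched in Python as below. This is a minimal sketch using only the standard library, not an official client: the names http_json and scroll_all, the base URL and the injectable send parameter are assumptions made for illustration. It sends the scroll parameter on every request so the expiry is renewed for each batch, and it clears the scroll when the loop ends (the clear-scroll request is described later in this post).

```python
import json
from urllib import request


def http_json(method, url, body=None):
    """Send a JSON request to Elasticsearch and decode the JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = request.Request(url, data=data, method=method,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))


def scroll_all(base_url, index, query, keep_alive="1m", size=100, send=http_json):
    """Yield every hit for `query`, renewing the scroll on each batch."""
    # Initial search: the scroll parameter opens the search context.
    page = send("POST", f"{base_url}/{index}/_search?scroll={keep_alive}",
                {"size": size, "query": query})
    scroll_id = page.get("_scroll_id")
    try:
        while page["hits"]["hits"]:
            yield from page["hits"]["hits"]
            # Passing "scroll" again sets a new expiry time for the next batch.
            page = send("POST", f"{base_url}/_search/scroll",
                        {"scroll": keep_alive, "scroll_id": scroll_id})
            scroll_id = page.get("_scroll_id", scroll_id)
    finally:
        # Free the search context instead of waiting for the timeout.
        if scroll_id:
            send("DELETE", f"{base_url}/_search/scroll", {"scroll_id": scroll_id})
```

Because keep_alive only has to cover the processing of one batch, a short value such as 1m is usually enough even for a long export.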


Normally, the background merge process optimizes the index by merging together smaller segments to create new bigger segments, at which time the smaller segments are deleted. This process continues during scrolling, but an open search context prevents the old segments from being deleted while they are still in use. This is how Elasticsearch is able to return the results of the initial search request, regardless of subsequent changes to documents.


How to Prevent & Fix it?

Now we know that many concurrent scroll requests, especially ones with a long scroll time (for example 60m), hold on to resources for a long time and can cause this issue. Keep the scroll time as short as your batch processing allows, and clear each scroll as soon as you are done with it.

If you get this error and cannot run update or delete operations on your cluster (update-by-query and delete-by-query use scrolls internally), either clear your scrolls or temporarily increase max_open_scroll_context until the open scrolls expire on their own. Raising the limit is not a recommended long-term solution, but it can be your savior when you need to avoid data loss or must not interrupt scrolls that are still in progress.


Clear Scroll API:

Search contexts are automatically removed when the scroll timeout has been exceeded. However, keeping scrolls open has a cost, so each scroll should be explicitly cleared as soon as it is no longer in use, using the clear-scroll API:


DELETE /_search/scroll
{
    "scroll_id" : "DXF1ZXJ5QW5kRmV0Y2gBAAAAAAAAAD4WYm9laVYtZndUQlNsdDcwakFMNjU1QQ=="
}
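The clear-scroll API also accepts several scroll IDs as an array, and DELETE /_search/scroll/_all clears every open search context at once, which is handy when the cluster has already hit the limit. Here is a small Python sketch of both variants. The function names and the base URL are hypothetical, and note that _all also cancels scrolls that other clients are still using.

```python
import json
from urllib import request


def clear_scroll_request(base_url, scroll_ids=None):
    """Build (method, url, body) for the clear-scroll API.

    With no IDs, target every open search context via /_all.
    """
    if not scroll_ids:
        return "DELETE", f"{base_url}/_search/scroll/_all", None
    # Multiple scroll IDs can be passed as an array.
    return "DELETE", f"{base_url}/_search/scroll", {"scroll_id": list(scroll_ids)}


def send(method, url, body=None):
    """Send the request built above and return the decoded JSON reply."""
    data = json.dumps(body).encode("utf-8") if body is not None else None
    req = request.Request(url, data=data, method=method,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

For example, send(*clear_scroll_request("http://127.0.0.1:9200")) would clear all scrolls on a local cluster.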



Increase the size of max_open_scroll_context

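To guard against issues caused by having too many scrolls open, Elasticsearch limits the number of open scrolls per node with the search.max_open_scroll_context cluster setting. The default is 500 in Elasticsearch 7.0 and later, which matches the error message above; in 6.x the default was unlimited.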


To check the current value, including the default, request this URL in a browser or with curl:

http://127.0.0.1:9200/_cluster/settings?include_defaults=true&pretty=true
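The response can be long, so here is a small Python sketch that picks the effective limit out of it. It assumes you add flat_settings=true to the URL above so that setting names come back as flat dotted keys. The function name effective_scroll_limit is made up for this example. A transient value takes precedence over a persistent one, and both override the default.

```python
SETTING = "search.max_open_scroll_context"


def effective_scroll_limit(settings):
    """Return (value, source) from a parsed response of
    GET /_cluster/settings?include_defaults=true&flat_settings=true
    """
    # Transient overrides persistent, which overrides the built-in default.
    for source in ("transient", "persistent", "defaults"):
        value = settings.get(source, {}).get(SETTING)
        if value is not None:
            return int(value), source
    return None, None
```

Pass it the decoded JSON body, for example the result of json.loads on the response text.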


To update max_open_scroll_context, you can use the following command (replace ip with the address of one of your nodes):

curl -X PUT http://ip:9200/_cluster/settings -H 'Content-Type: application/json' -d'{
    "persistent" : {
        "search.max_open_scroll_context": 5000
    },
    "transient": {
        "search.max_open_scroll_context": 5000
    }
}'


Note: Don't forget to set it back to a lower number once the open scrolls have expired.
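One way to set it back is to send null for the setting, which removes the override so the default applies again. The Python sketch below only builds the request body for PUT /_cluster/settings. The helper name scroll_limit_body is an assumption for illustration. It resets both the persistent and the transient scope by default because the command above sets both.

```python
import json

SETTING = "search.max_open_scroll_context"


def scroll_limit_body(limit=None, scopes=("persistent", "transient")):
    """Build the JSON body for PUT /_cluster/settings.

    limit=None serializes to null, which removes the override so the
    setting falls back to its default.
    """
    for scope in scopes:
        if scope not in ("persistent", "transient"):
            raise ValueError(f"unknown settings scope: {scope}")
    return json.dumps({scope: {SETTING: limit} for scope in scopes})
```

Calling scroll_limit_body() gives the reset body, and scroll_limit_body(5000, scopes=("transient",)) gives a temporary increase in the transient scope only.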


Thanks! Enjoy Programming!!


Reference Links:

https://www.elastic.co/guide/en/elasticsearch/reference/6.8/search-request-scroll.html

