Elasticsearch Indexing Jobs in Sugar 7

In this post, Jelle Vink, SugarCRM's Security Architect and resident Elasticsearch expert, explains how the Sugar Job Scheduler and Job Queue affect Sugar 7's record indexing behavior.

Cron.php Execution

When cron.php is executed, there are limits on how many jobs the driver executes and how long it will run. When either maximum is reached, the current cycle terminates. The default maximums are 25 jobs and 1,800 seconds. Both can be changed in config_override.php:

$sugar_config['cron']['max_cron_jobs'] = 25;
$sugar_config['cron']['max_cron_runtime'] = 1800;

There is also a minimum interval in minutes (which defaults to 1). If cron is executed multiple times in a row, it only actually does something when the minimum interval has elapsed. Setting it to 0 allows another cycle to run immediately after the previous one finishes:

$sugar_config['cron']['min_cron_interval'] = 0;



Elasticsearch Job Creation

A number of schedulers are configured out of the box in Sugar 7. When cron is executed, the driver starts by executing schedulers that are due. These schedulers are not jobs themselves; they simply create new jobs to be executed. These jobs are then stored in the job_queue table.

Once schedulers have created the necessary jobs, the driver starts executing the different jobs based on order of creation, status, job delay, and execution time. For Elasticsearch there is one scheduler, configured to run as often as possible, which means every time cron is executed. This scheduler creates a consumer job for every module that has queued Elasticsearch records in the fts_queue table.
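As a rough illustration, you can inspect which modules currently have queued records and would therefore get a consumer job. The fts_queue column names below follow a stock Sugar 7 schema and are an assumption to verify against your instance:

```sql
-- One consumer job is created per module that has unprocessed rows here.
SELECT bean_module, COUNT(*) AS queued
FROM fts_queue
WHERE processed = 0
GROUP BY bean_module;
```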

When a full reindex has been triggered by a Sugar Administrator, a consumer job for every FTS enabled module will be created and queued.

Always remember that your Elasticsearch jobs are not alone in the job queue. Other schedulers create jobs such as Email reminders, Database pruning, and Check inbound email boxes. Jobs can also be created outside of schedulers via logic hooks or other custom code.

Job Execution

As explained above, the cron driver will run at most 25 jobs from the queue during each cycle. There is no guarantee that these will be Elasticsearch jobs; other jobs may also be waiting in the queue. Elasticsearch jobs are not given priority, as all jobs are treated equally to guarantee that every job is eventually executed.

For Elasticsearch-specific jobs there is also a maximum number of records that one Elasticsearch job will consume from the queue for a given module. As explained above, one Elasticsearch (consumer) job processes only a single module. The maximum number of records a consumer job will process for one module is 15,000 by default. This can be configured using the following setting.

$sugar_config['search_engine']['max_bulk_query_threshold'] = 15000;

Effects on Elasticsearch Indexing

In the demo data, no single module has more than 15,000 records. The only limiting factor here is the number of jobs created, which in certain cases is higher than the default maximum of 25. To get everything indexed during a full reindex, on average at least 2 cron runs are needed.
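The cycle arithmetic above can be sketched as follows; the module count used in the example is an illustrative assumption, not a fixed Sugar value:

```python
import math

def cron_cycles_needed(jobs_queued, max_cron_jobs=25):
    """Each cron cycle runs at most max_cron_jobs jobs, so draining
    jobs_queued jobs takes ceil(jobs_queued / max_cron_jobs) cycles."""
    return math.ceil(jobs_queued / max_cron_jobs)

# A full reindex queues one consumer job per FTS-enabled module;
# with e.g. 40 such modules and the default limit of 25:
print(cron_cycles_needed(40))  # -> 2
```

In practice other queued jobs (email reminders, pruning, etc.) share the same 25-job budget, so the real number of cycles can be higher.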

When testing Elasticsearch (full) reindexing after running cron, you should ensure that no records are left in the fts_queue table. This is the only confirmation that all records are present in Elasticsearch. A single cycle may not be enough to index all records!
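A minimal check after cron completes might look like this; the processed column is assumed from a stock Sugar 7 fts_queue schema:

```sql
-- Should return 0 once indexing is fully caught up.
SELECT COUNT(*) AS remaining FROM fts_queue WHERE processed = 0;
```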

While this may cause issues for Sugar Developers doing local development without cron set up, it is not an issue on a properly configured production system. For example, once a cron cycle stops after 25 jobs, the next cycle happens soon after (we typically recommend triggering cron every minute). That next run picks up the next 25 jobs, and so on, until indexing is complete.
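A typical crontab entry for the every-minute recommendation looks like the following; the install path and output redirection are assumptions to adapt to your environment:

```shell
# Run the Sugar job queue driver once per minute.
* * * * * cd /var/www/sugarcrm && php -f cron.php > /dev/null 2>&1
```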

Additional Elasticsearch Fine-Tuning

The following config_override options are available for an admin to fine-tune indexing performance. These may change in the future, as we are considering refactoring our queue out of the Sugar database. The values below are the defaults:

$sugar_config['search_engine']['max_bulk_query_threshold'] = 15000;
$sugar_config['search_engine']['max_bulk_delete_threshold'] = 3000;
$sugar_config['search_engine']['force_async_index'] = false;
$sugar_config['search_engine']['max_bulk_threshold'] = 100;

Development / QA Recommendations

We recommend adding the following to your deploy/automation scripts to circumvent any issues with Elasticsearch (re)indexing and general cron usage.

All changes have to be done in config_override.php:

$sugar_config['cron']['max_cron_jobs'] = 500;
$sugar_config['cron']['min_cron_interval'] = 0;

This ensures that when a QA person or Sugar Developer executes cron.php multiple times in a short time frame, cron runs immediately each time and tends to clear the queue fully when there are many jobs to be run.
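With those settings in place, a dev/QA workflow can simply invoke cron back-to-back; this is a sketch, and the install path is an assumption:

```shell
# With min_cron_interval = 0, consecutive runs are not throttled.
cd /var/www/sugarcrm
php -f cron.php   # first cycle: runs up to max_cron_jobs jobs
php -f cron.php   # runs immediately, picking up the next batch
```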

  • Comment originally made by Matthew Marum.

    Hi Kat,

    You should not need to install 2 Sugar instances to run in HA mode. You shouldn't need to change unique_keys or similar settings either. Typically, you'd use something like NFS to keep Sugar instance files in sync across multiple web servers.

    I'd deploy Elasticsearch and your DB servers separately in a HA configuration.

    A HA deployment of Sugar could look something like this.

    http://support.sugarcrm.com/Resources/Environments/Sugar_7_On-Site_Sizing_Guide/#Deployment_Topology_3

  • Comment originally made by Kat.

    Hi Matt,

    I have two instances of SugarCRM set up in HA. I tried a system index on the first server, then on the second server, but it seems that ES is only using one server. I also tried changing the unique_key in config.php, and I set up a unique cluster.name and node.name for each server.

    Is it possible to set up ES on both servers so that when one server goes down, it won't depend on the other server?

    By the way, ES is installed on each server.

    Thanks

  • Comment originally made by Matthew Marum.

    It may be that these records were added to fts_queue and then later deleted before they were indexed. It should be safe to truncate the fts_queue table to remove those records. By default, indexing happens in real time which means the fts_queue is used only for full re-indexes.

    "We recommend adding the following to our deploy/automation to circumvent any issues regarding Elasticsearch (re)indexing and general cron usage."

    We are just suggesting that you use these settings in your development/QA environments. If you have any automated deployment scripts or tools for your dev/QA environments then it may make sense to incorporate these settings into those.

  • Comment originally made by Matthew Marum.

    Hi Tam,

    I know this message was from a while ago but hopefully you will appreciate some of the FTS monitoring commands that were added as part of Sugar CLI in 7.7.x releases.

  • Comment originally made by Marc.

    Thanks for the useful article, which makes the world a better place :)

    However, I have 2 questions:

    After running a full re-index quite a long time ago, I noticed that in the job_queue table the job "FTSConsumer Accounts" has been marked as "done" and "success"; however, in the fts_queue table there are still quite a lot of account records left with "processed" equal to 0. Then I found those are the records that had been deleted (deleted = 1). My question is: what is Sugar's intention? If those are not supposed to be indexed, which I believe is the case, why not exclude them from the fts_queue table in the first place?

    Another question is about "$sugar_config['cron']['max_cron_jobs'] = 500;". When you say to add it to "deploy/automation", did you mean the live environment?