AEM 6.0 and above: AEM Oak queries and indexing

Unlike Jackrabbit 2, Oak does not index content by default. Custom indexes have to be created when they are needed.

Indexes are configured as nodes in the repository under the oak:index node, with the node type oak:QueryIndexDefinition.

Property Index :

1. Go to CRXDE and, under oak:index, create a new node named PropertyIndex of type oak:QueryIndexDefinition
2. Set these properties on the node - type: property and propertyNames: the names of the properties to index
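
The same definition can also be created without CRXDE, for example by posting form fields to the Sling POST servlet. The following is only a minimal sketch that builds the request body and sends nothing. The node name PropertyIndex comes from the steps above; the function name and the property names jcr:title and cq:template are hypothetical, and the Name[] type hint assumes propertyNames is stored as a multi-valued Name property.

```python
from urllib.parse import urlencode


def property_index_params(property_names):
    """Build Sling POST form fields for an Oak property index definition."""
    if not property_names:
        raise ValueError("at least one property name is required")
    params = [
        ("jcr:primaryType", "oak:QueryIndexDefinition"),
        ("type", "property"),
        # Ask the Sling POST servlet to store propertyNames as Name[]
        ("propertyNames@TypeHint", "Name[]"),
    ]
    # One form field per indexed property (multi-valued property)
    params += [("propertyNames", name) for name in property_names]
    return params


# Hypothetical example properties. The encoded body would be POSTed to
# http://serveraddress:port/oak:index/PropertyIndex with suitable credentials.
body = urlencode(property_index_params(["jcr:title", "cq:template"]))
print(body)
```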

Ordered Index :

Deprecated. A Lucene index should be used instead.

Lucene Full Index :

The index is updated asynchronously by a background thread.

1. Go to CRXDE and, under oak:index, create a new node named LuceneIndex of type oak:QueryIndexDefinition
2. Set these properties on the node - type: lucene and async: async

Lucene Property Index :

1. Go to CRXDE and, under oak:index, create a new node named LucenePropertyIndex of type oak:QueryIndexDefinition
2. Set these properties on the node - type: lucene, async: async, fulltextEnabled: false and includePropertyNames: the names of the properties to index
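
To make the difference between the two Lucene definitions easier to see, the sketch below models both as plain Python dictionaries and prints them as JSON. It is an illustration of the node properties listed above, not an import format; the helper name lucene_index and the property cq:lastModified are hypothetical.

```python
import json


def lucene_index(fulltext=True, property_names=None):
    """Return the node properties of a Lucene index definition as a dict."""
    definition = {
        "jcr:primaryType": "oak:QueryIndexDefinition",
        "type": "lucene",
        "async": "async",  # updated by the background indexing thread
    }
    if not fulltext:
        # Property-only index: full-text off, limited to the listed properties
        definition["fulltextEnabled"] = False
        definition["includePropertyNames"] = list(property_names or [])
    return definition


full_index = lucene_index()
# "cq:lastModified" is a hypothetical example property
property_index = lucene_index(fulltext=False, property_names=["cq:lastModified"])

print(json.dumps({"LuceneIndex": full_index,
                  "LucenePropertyIndex": property_index}, indent=2))
```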

You can provide custom analyzers for Lucene. An analyzer can consist of a tokenizer, token filters and char filters.

Solr can also be used, either as an embedded Solr configuration or as an external Solr server.

You can debug AEM queries using query debug logging or the JMX MBean output -

http://serveraddress:port/system/console/jmx

Best Practices :

The Explain Query tool can be used to check how a query is executed and which index it uses.
In components, prefer traversal or prefetched results over queries.
Make sure indexes are in place for the queries you run.
Instead of one large query, run several smaller queries and combine the results where possible.
Set limits for query execution:

-Doak.queryLimitInMemory=500000
-Doak.queryLimitReads=100000
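
As an illustration of the advice to break a large query into smaller ones, the sketch below generates one JCR-SQL2 statement per content path and merges the per-query results. It only builds strings and merges lists; executing the queries is left out. The function names, the node type cq:Page and the /content/mysite paths are hypothetical.

```python
def split_query(node_type, paths):
    """Return one JCR-SQL2 statement per path instead of a single broad query."""
    queries = []
    for path in paths:
        if "'" in path:
            raise ValueError("path must not contain a single quote: " + path)
        queries.append(
            "SELECT * FROM [%s] AS n WHERE ISDESCENDANTNODE(n, '%s')"
            % (node_type, path)
        )
    return queries


def combine(result_sets):
    """Merge per-query results (lists of node paths), keeping first-seen order."""
    seen = set()
    merged = []
    for paths in result_sets:
        for path in paths:
            if path not in seen:
                seen.add(path)
                merged.append(path)
    return merged


# Hypothetical content paths and result lists
queries = split_query("cq:Page", ["/content/mysite/en", "/content/mysite/fr"])
for query in queries:
    print(query)

merged = combine([
    ["/content/mysite/en/a", "/content/mysite/en/b"],
    ["/content/mysite/fr/a", "/content/mysite/en/a"],
])
print(merged)
```

Each smaller query reads fewer nodes, which makes it less likely to hit the read and in-memory limits set above.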

Use Lucene indexes wherever possible

Solr should be used when the server capacity is limited.

An external Solr server should be used only when required, as it introduces latency

Optimize indexes so that queries run faster. For example: enable evaluatePathRestrictions, let the index handle sorting, index only the content that is required, define index rules per node type, and restrict indexes to the paths where queries run
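
A rough sketch of what such a narrowed Lucene index definition could look like, modelled as a nested Python dictionary that mirrors the node structure. The property names (compatVersion, evaluatePathRestrictions, includedPaths, indexRules, propertyIndex, ordered) follow the Oak Lucene index definition format as far as I know it; the path /content/mysite, the rule for cq:Page and the indexed property are hypothetical and should be checked against the Oak documentation for your version.

```python
import json

index_definition = {
    "jcr:primaryType": "oak:QueryIndexDefinition",
    "type": "lucene",
    "async": "async",
    "compatVersion": 2,
    # Let the index evaluate path restrictions itself
    "evaluatePathRestrictions": True,
    # Only index content below the paths that are actually queried (hypothetical path)
    "includedPaths": ["/content/mysite"],
    # Rules per node type: index only the properties that queries need
    "indexRules": {
        "cq:Page": {
            "properties": {
                "lastModified": {
                    "name": "jcr:content/cq:lastModified",
                    "propertyIndex": True,
                    "ordered": True,  # allows sorting on this property
                }
            }
        }
    },
}


def indexed_properties(definition):
    """List the indexed property names per node type in a definition dict."""
    return {
        node_type: [prop["name"] for prop in rule["properties"].values()]
        for node_type, rule in definition.get("indexRules", {}).items()
    }


print(json.dumps(index_definition, indent=2))
print(indexed_properties(index_definition))
```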

If your node store is remote, enable CopyOnRead so that index files are copied to the local file system

Oak indexes should not be reindexed unless the index configuration has changed or an index binary is missing or corrupted.

Text Pre-Extraction of Binaries

The process of extracting and processing text from binaries directly from the data store, via an isolated process.

Useful when Lucene reindexing covers a large volume of binaries with extractable text (PDFs, Word documents, etc.) and full-text search is expected.

Also useful when deploying new Lucene indexes.

Saturday, June 20, 2020
