We name our datasources after the sites they index, which doesn't really work well in ElasticSearch.
When the field is analyzed, ElasticSearch treats characters such as the : and // in http:// as separators and cuts the value up,
so http://thesun.co.uk becomes 4 terms:
http
thesun
co
uk
Or, if you use a file share, \\myserver\logs becomes:
myserver
logs
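To get a feel for the effect, here is a rough shell approximation of that splitting. It is not ElasticSearch's actual analyzer, just a sketch that breaks a value at every non-alphanumeric character; split_terms is a made-up helper name.

```shell
# Rough approximation only: break a value at every run of
# non-alphanumeric characters, one resulting term per line.
# This is NOT the real ElasticSearch analyzer, just an illustration.
split_terms() {
  printf '%s\n' "$1" | tr -cs '[:alnum:]' '\n' | sed '/^$/d'
}

split_terms 'http://thesun.co.uk'   # http, thesun, co, uk (one per line)
split_terms '\\myserver\logs'       # myserver, logs (one per line)
```

A facet on a field split like this counts the pieces (http, co, uk and so on) rather than the datasource names.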
But if you map the datasource field (ContentSource in the mapping below) as "not_analyzed", the whole value is indexed as a single term, so the facet search works and doesn't cut it up at the special characters:
curl -XPUT "http://localhost:9200/crawllog/log/_mapping" -d '{
  "log": {
    "properties": {
      "Url": {"type": "string"},
      "ErrorMessage": {"type": "string"},
      "Date": {"type": "string"},
      "ErrorDescription": {"type": "string"},
      "ErrorId": {"type": "long"},
      "ContentSource": {"type": "string", "index": "not_analyzed"}
    }
  }
}'
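With that mapping in place, a terms facet on ContentSource could look something like the sketch below. It assumes the same local node and crawllog index as above, and an ElasticSearch version old enough to still have the facets API (later versions replaced it with aggregations). The facet name "sources" and the function names facet_body and run_facet are made up for the example.

```shell
# Sketch of a terms facet on the not_analyzed ContentSource field.
# Assumptions: local node on localhost:9200, the crawllog index from above,
# and a version of ElasticSearch that still supports facets.
facet_body() {
  cat <<'EOF'
{
  "size": 0,
  "query": { "match_all": {} },
  "facets": {
    "sources": {
      "terms": { "field": "ContentSource", "size": 10 }
    }
  }
}
EOF
}

# Sends the facet request; nothing is sent until you call this function.
run_facet() {
  curl -s -XPOST "http://localhost:9200/crawllog/log/_search" -d "$(facet_body)"
}
```

Calling run_facet against a running node should return each datasource as one whole term with its count, provided the documents were indexed after the mapping was set.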