yacy_search_server

Michael Peter Christen 9ad1d8dde2 complete redesign of crawl queue monitoring: do not look at a ready-prepared crawl list but at the stacks of the domains that are stored for balanced crawling. This also affects the balancer, since it no longer needs to prepare the pre-selected crawl list for monitoring. As an effect: - it is no longer possible to see the correct order of the next to-be-crawled links, since that depends on the actual state of the balancer stack the next time another url is requested for loading - the balancer works better since the next url can be selected according to the current situation and not according to a pre-selected order.		13 years ago
html	complete redesign of crawl queue monitoring: do not look at a	13 years ago
images	smaller bug fixes for search behavior; should produce less unnecessary removals and an exact number of results as shown in counter	14 years ago
xml	some last-minute performance hacks	14 years ago
bzipParser.java	added new configuration property "crawler.embedLinksAsDocuments". If this is switched on (this is default now), then all embedded image, audio and video links from all parsed documents are added to the search index as individual documents. This will increase the search index size dramatically but will also enable us to create a much faster image, audio and video search. If the flag is switched on, the index entries are also stored to a solr index, if this is also enabled.	14 years ago
csvParser.java	- enhanced html parser: recognized much more details in the content	14 years ago
docParser.java	- enhanced html parser: recognized much more details in the content	14 years ago
genericParser.java	- enhanced html parser: recognized much more details in the content	14 years ago
gzipParser.java	added new configuration property "crawler.embedLinksAsDocuments". If this is switched on (this is default now), then all embedded image, audio and video links from all parsed documents are added to the search index as individual documents. This will increase the search index size dramatically but will also enable us to create a much faster image, audio and video search. If the flag is switched on, the index entries are also stored to a solr index, if this is also enabled.	14 years ago
htmlParser.java	performance hacks	14 years ago
mmParser.java	- enhanced html parser: recognized much more details in the content	14 years ago
odtParser.java	set a limit to CharBuffer object size to fight against bad/too large	13 years ago
ooxmlParser.java	set a limit to CharBuffer object size to fight against bad/too large	13 years ago
pdfParser.java	memory hacks	13 years ago
pptParser.java	- enhanced html parser: recognized much more details in the content	14 years ago
psParser.java	- enhanced html parser: recognized much more details in the content	14 years ago
rssParser.java	- enhanced html parser: recognized much more details in the content	14 years ago
rtfParser.java	- enhanced html parser: recognized much more details in the content	14 years ago
sevenzipParser.java	added new configuration property "crawler.embedLinksAsDocuments". If this is switched on (this is default now), then all embedded image, audio and video links from all parsed documents are added to the search index as individual documents. This will increase the search index size dramatically but will also enable us to create a much faster image, audio and video search. If the flag is switched on, the index entries are also stored to a solr index, if this is also enabled.	14 years ago
sidAudioParser.java	- enhanced html parser: recognized much more details in the content	14 years ago
sitemapParser.java	better abstraction of http client identification	14 years ago
swfParser.java	some last-minute performance hacks	14 years ago
tarParser.java	added new configuration property "crawler.embedLinksAsDocuments". If this is switched on (this is default now), then all embedded image, audio and video links from all parsed documents are added to the search index as individual documents. This will increase the search index size dramatically but will also enable us to create a much faster image, audio and video search. If the flag is switched on, the index entries are also stored to a solr index, if this is also enabled.	14 years ago
torrentParser.java	- enhanced html parser: recognized much more details in the content	14 years ago
vcfParser.java	some last-minute performance hacks	14 years ago
vsdParser.java	- enhanced html parser: recognized much more details in the content	14 years ago
xlsParser.java	- enhanced html parser: recognized much more details in the content	14 years ago
zipParser.java	added new configuration property "crawler.embedLinksAsDocuments". If this is switched on (this is default now), then all embedded image, audio and video links from all parsed documents are added to the search index as individual documents. This will increase the search index size dramatically but will also enable us to create a much faster image, audio and video search. If the flag is switched on, the index entries are also stored to a solr index, if this is also enabled.	14 years ago