Searsia comes with a client and a server.
The Searsia Web client can be downloaded as searsiaclient.zip and unzipped on your local machine or web server. To use the web client:
1. Set API_TEMPLATE in the second line of the file js/searsia.js;
2. Open index.html in a web browser.

The API template is a URL with a placeholder for the query and possibly other parameters. Examples of API templates of online Searsia servers are those of the University of Twente's site search engine and Dr. Sheet Music:
https://search.utwente.nl/searsia/index.json?q={searchTerms}&page={startPage?}
or
https://drsheetmusic.com/searsia/index.json?q={searchTerms}&page={startPage?}
If you run Searsia Server on your local machine (see next section), you can connect the client to your own server by setting the API template to something like:
http://localhost:16842/searsia/index?q={searchTerms?}&page={startPage?}
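To make the placeholders concrete, here is a minimal Python sketch of how a client might fill in an API template. It assumes the placeholders follow the OpenSearch convention, where a trailing ? marks a parameter as optional; the function name fill_template is hypothetical and is not part of Searsia.

```python
import re
from urllib.parse import quote


def fill_template(template, **params):
    """Fill {name} and {name?} placeholders in an API template."""

    def substitute(match):
        name = match.group(1)
        optional = match.group(2) == "?"
        if params.get(name) is not None:
            # URL-encode the value so spaces and '&' do not break the URL
            return quote(str(params[name]), safe="")
        if optional:
            return ""  # assumption: optional placeholders may be left empty
        raise KeyError(f"missing required parameter: {name}")

    return re.sub(r"\{(\w+)(\??)\}", substitute, template)


TEMPLATE = "https://search.utwente.nl/searsia/index.json?q={searchTerms}&page={startPage?}"

print(fill_template(TEMPLATE, searchTerms="sheet music", startPage=2))
print(fill_template(TEMPLATE, searchTerms="sheet music"))
```

Leaving out a required parameter such as searchTerms raises a KeyError in this sketch, while a missing optional parameter is simply replaced by an empty string.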
Download the Java server searsiaserver.jar and use the following command to start the server:
java -jar searsiaserver.jar
The server requires Java 8 or higher. Just like the web client, the Java server needs another Searsia server's API template to connect to. We call the other server the mother, because your server will learn what it needs to know from it. If you start the server without an API template, it will display the following message:
Please provide mother's api template (use '-m').
Additionally, the server displays all server options.
Use the option -m to provide the API template, for instance one of the example templates under Client Options above.
Your Searsia server copies the search engine definitions of the server it connects to (the mother, specified by the API template). Your server has its own API template, which it displays at startup. For instance, connect to Dr. Sheet Music as follows:
java -jar searsiaserver.jar -m 'https://drsheetmusic.com/searsia/index.json?q={searchTerms}&page={startPage?}'
Searsia server v1.0.2
Starting: Dr. Sheet Music (index)
API template: http://localhost:16842/searsia/index?q={searchTerms}&page={startPage?}
Use Ctrl+c to stop.
Use the reported API template in your client as explained above to connect your client to your server.
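The single quotes around the API template in the command above matter, because characters such as & and ? are special to most shells. As a rough illustration, the following Python sketch builds the same command line with the standard shlex module. The helper name server_command is hypothetical, and only the -m option described on this page is used.

```python
import shlex

MOTHER = "https://drsheetmusic.com/searsia/index.json?q={searchTerms}&page={startPage?}"


def server_command(mother_template, jar="searsiaserver.jar"):
    """Return the argument list that starts the server with the given mother."""
    return ["java", "-jar", jar, "-m", mother_template]


# shlex.join quotes each argument so it is safe to paste into a POSIX shell
print(shlex.join(server_command(MOTHER)))
```

The unquoted argument list can also be passed directly to subprocess.run, in which case no shell quoting is needed at all.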
A Searsia server provides access to many search engines. Together, the search
engines form a federation: just as a federation of states forms one country,
Searsia manages a federation of search engines that together act as one search
engine. Each search engine in the federation has a unique identifier.
The identifier of the search engine in the examples above is 'index'.
Suppose there is another search engine in the federation that is called
'didyoumean', then we can access it by replacing 'index' by 'didyoumean' as follows:
https://search.utwente.nl/searsia/didyoumean
If you open this URL in your browser, you will see the JSON search engine definition below:
{
  "resource": {
    "apitemplate": "https://search.utwente.nl/searsia/didyoumean.php?q={searchTerms}",
    "favicon": "https://search.utwente.nl/ut-icons/ut.png",
    "id": "didyoumean",
    "mimetype": "application/searsia+json",
    "name": "Did you mean:",
    "testquery": "test"
  },
  "searsia": "v1.0.2"
}
Searsia servers copy and share these search engine definition files.
We can now start a local Searsia server that serves search results
from 'didyoumean' as follows:
java -jar searsiaserver.jar -m 'https://search.utwente.nl/searsia/didyoumean?q={searchTerms}'
Searsia retrieves the definition file and runs a local copy that
provides results from 'didyoumean'.
In fact, we might also download the JSON definition file to our machine
and then start Searsia Server from there with the same effect:
wget https://search.utwente.nl/searsia/didyoumean.json
java -jar searsiaserver.jar -m didyoumean.json
Searsia Server tests a search engine using the -t option as follows:
java -jar searsiaserver.jar -m didyoumean.json -t json
This will output the JSON search results for the "testquery" that
is specified in didyoumean.json, or an error if there is a
problem with the definition file or the didyoumean search engine.
Testing search engines will be needed frequently when setting up a new Searsia engine, or when maintaining an existing Searsia engine. More about adding engines to the federation can be found on the Protocol page.
Searsia server supports several command line options, shown in the following
table. For convenience, each option has a shorthand consisting of one hyphen
and the first letter of the option, for instance -h for --help.
Option | Explanation |
---|---|
--cache <arg> | Set cache size (integer: number of result pages). The default is 500 pages. |
--dontshare | Do not share resource definitions. This will make it impossible for other servers to meaningfully connect to your server. |
--export | Export index to stdout and exit. |
--help | Show help. |
--interval <arg> | Takes as argument an integer, which is the poll interval (in seconds). The server sends a random query to one search engine each interval. The default value is 120 seconds. If your server contains 30 resources (search engines), a polling interval of 120 seconds will poll each resource on average once per 120 * 30 = 3600 seconds, that is, once every 60 minutes, or about 24 queries a day per resource. In scientific literature on distributed information retrieval, polling is usually called query-based sampling. An interval of 0 disables polling. |
--log <arg> | Takes as argument an integer, which is the level of log messages produced by the server (written to the index directory). Supported levels are: 0 = no logging; 1 = only errors; 2 = errors and warnings; 3 = errors, warnings, and information; 4 = all of level 3 plus debug information. The default level is 2. |
--mother <arg> | REQUIRED. Sets the API template of the mother. See above for example values. |
--nohealth | Do not share health report. |
--path <arg> | Set the path on the file system where the index is stored. The default depends on your operating system. Typically, the index ends up in your home directory: under .local/share/searsia on a Linux-based system, under Library/Application Support/Searsia on macOS, and under Application Data/Searsia on many Windows versions. |
--quiet | No output to console. |
--test <arg> | Print test output and exit (argument: 'json', 'xml', 'response', or 'all'). |
--url <arg> | Set the URL of the web service endpoint. The default is 'http://localhost:16842/searsia/'. |
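The polling arithmetic in the --interval row can be checked with a short calculation. This is a minimal Python sketch under the assumption stated in the table, namely that one resource is polled per interval; the function name seconds_between_polls is made up for illustration and is not part of Searsia.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def seconds_between_polls(interval_seconds, resource_count):
    """Average time between two polls of the same resource.

    Returns None when polling is disabled (interval 0).
    """
    if interval_seconds == 0:
        return None
    return interval_seconds * resource_count


period = seconds_between_polls(120, 30)   # default interval, 30 resources
print(period)                             # seconds between polls of one resource
print(SECONDS_PER_DAY // period)          # polls per resource per day
```

With the default interval and 30 resources this gives 3600 seconds between polls of the same resource, which matches the 24 queries a day mentioned in the table.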
The server might return the following error messages:
Setup failed: index_75cbc797a03dfe0dd08780e6c68e7bbc
Setup failed: Lock obtain timed out: NativeFSLock
Server failed: Failed to start Grizzly HTTP server: Address already in use
Server failed: Failed to start Grizzly HTTP server: Unresolved address, or Server failed: Failed to start Grizzly HTTP server: Permission denied
Error: Connection failed: java.net.UnknownHostException, or Error: Connection failed: java.io.FileNotFoundException
Error: Connection failed: org.json.JSONException
searsiaclient.zip (213 KB)
searsiaserver.jar (11.2 MB)