Vipra
The Vipra application is a search system based on topic modeling, consisting of a frontend web application, a backend REST service and a maintenance tool for data import and modeling. It attempts to leverage automatically discovered topic information in document collections to ease collection browsing and organization. The search system relies on ElasticSearch and Apache Lucene.
This application was created by Eike Cochu for his master's degree thesis in computer science, 2015-2016 at the Freie Universität in Berlin, Germany.
Components
- vipra-backend: Backend application that connects to the database, filebase and search engine. README
- vipra-cmd: Backend utility tool to import and manage backend services. README
- vipra-ui: Frontend user interface that connects to the backend REST service. README
- vipra-util: Shared libraries and classes for backend applications and utility tools. README
Installation
Manual
- Deploy vipra.war or exploded vipra directory to a Java application server. Deploy to ROOT (/) or a different context path (/vipra, ...)
- Edit WEB-INF/classes/config.json if MongoDB or ElasticSearch do not run with default configuration (host/port)
- Deploy vipra-ui to a static webserver
- Create rewrite rules to rewrite all non-matching files to /index.html
- If not deployed in root, change /index.html tag to context path
- Change /js/config.js if the backend was not deployed on the same server, not in root context or if the server has a different port than 8080
If everything is left at its defaults, the application should be available under http://someserver/ and the backend under http://someserver:8080/rest/
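As a rough illustration of how the backend URL depends on the chosen host, port and context path, the following sketch prints the REST base URL that /js/config.js would have to point at. The function name vipra_backend_url is hypothetical and not part of Vipra; the /rest/ suffix and the default port 8080 are taken from the text above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Vipra): print the backend REST base URL
# for a given host, port and context path.
vipra_backend_url() {
    host=$1
    port=${2:-8080}      # default Tomcat port mentioned above
    context=${3:-/}      # "/" means deployed to ROOT
    # Drop a trailing slash so that "/" and "/vipra/" both work
    context=${context%/}
    printf 'http://%s:%s%s/rest/\n' "$host" "$port" "$context"
}

vipra_backend_url someserver             # default deployment in ROOT
vipra_backend_url localhost 8080 /vipra  # deployed under /vipra
```

If the printed URL differs from http://someserver:8080/rest/, that is a hint that /js/config.js needs to be adjusted.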
Docker
- Load docker image:
docker pull eikecochu/vipra
- Create and run docker container:
docker run -p 80:80 -p 6789:6789 -p 9300:9300 -p 27017:27017 eikecochu/vipra
- Install vipra: ``
Nginx Proxy
To proxy requests through nginx, use this server configuration (replace #SERVER_NAME_HERE#):
server {
    listen 80;
    server_name #SERVER_NAME_HERE#;
    access_log off;
    error_log /var/log/nginx/vipra.error.log warn;
    include include.d/security;
    include include.d/ssl;
    location / {
        proxy_buffering off;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_pass http://localhost:80/;
    }
    location /ws {
        proxy_pass http://localhost:80/ws;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
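The placeholder replacement can also be scripted. This is a minimal sketch, assuming the configuration above has been saved as a template file; the function name render_vipra_conf and the server name example.org are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: read a config template on stdin and write it to
# stdout with the #SERVER_NAME_HERE# placeholder replaced by $1.
render_vipra_conf() {
    sed "s|#SERVER_NAME_HERE#|$1|g"
}

# Small demonstration on a single line of the template
printf 'server_name #SERVER_NAME_HERE#;\n' | render_vipra_conf example.org
```

With a saved template, the output could be redirected into the nginx configuration directory of your installation, followed by sudo nginx -t to check the syntax before reloading.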
Development
The following steps were reproduced in this order on a fully updated Ubuntu 15.10 virtual system to create a fully operational development environment for the Vipra projects. These steps are only required for project development. If you want to create a virtual machine for development, create a virtual hard drive of at least 10 GB.
-
Install required libraries and tools
- Java 8
- MongoDB: the database
- git: to clone the project repository.
- maven3: to install project dependencies
- Tomcat 8: a JavaEE application server
- gsl, gflags: for building dtm (see below)
sudo apt-get install openjdk-8-jdk mongodb git maven tomcat8 libgsl0-dev
Enable and test MongoDB:
sudo systemctl enable mongodb
sudo systemctl restart mongodb
mongo
If running inside a VM, rebind the MongoDB host:
sudo nano /etc/mongodb.conf
# comment out line:
# bind_ip = 127.0.0.1
# save
sudo systemctl restart mongodb
If watchman will be used (see below), additional packages are required to build it from source:
build-essential autoconf automake python-dev
-
Install a recent version of nodejs (the official repositories contain old versions)
curl -sL https://deb.nodesource.com/setup_5.x | sudo -E bash -
sudo apt-get install nodejs
# gulp needs a node executable, Ubuntu only creates a nodejs executable
sudo ln -s /usr/bin/nodejs /usr/bin/node
-
Install required node packages for frontend development
sudo npm install -g bower gulp
-
Build and install watchman. This step is optional but recommended, as nodejs watchers are less performant. This is only required if the gulp watch command will be used to continuously rebuild the frontend project assets.
git clone https://github.com/facebook/watchman.git
cd watchman
./autogen.sh
./configure
make
sudo make install
Raise the maximum number of inotify watchers to allow watching more files for changes
echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf && sudo sysctl -p
-
Install and enable ElasticSearch
wget https://download.elasticsearch.org/elasticsearch/release/org/elasticsearch/distribution/deb/elasticsearch/2.2.0/elasticsearch-2.2.0.deb
sudo dpkg -i elasticsearch-2.2.0.deb
rm elasticsearch-2.2.0.deb
sudo systemctl enable elasticsearch
sudo systemctl restart elasticsearch
If installed inside a VM, rebind the ElasticSearch host:
# the redirect must run as root, so pipe through sudo tee instead of using sudo echo
echo "network.bind_host: 0" | sudo tee -a /etc/elasticsearch/elasticsearch.yml
sudo systemctl restart elasticsearch
and test if it is available:
curl -XGET 'http://localhost:9200/'
-
Install Eclipse EE. A different Java development IDE (or none at all) can be used, but the Vipra projects are made with Eclipse and come with settings and run configurations. Install the M2Eclipse plugin for maven support in Eclipse, if not yet installed. The plugin will download the required project dependencies and manage the project configurations, build versioning and packaging. Set up the installed Java EE application server in Eclipse. Add the vipra-backend project to the web server.
-
Clone the project repository and edit the build.sh script
git clone https://somerepo.url/vipra.git
cd vipra
nano build.sh
Make sure the 'TOMCAT_WEBAPPS' path points to the webapps directory of the Tomcat installation if you want the WAR file to be deployed automatically.
To build the projects, run:
./build.sh
If auto deployment is disabled or the path to the webapps directory is wrong, you need to manually copy
./vipra-backend/target/vipra.war
to your server's web application directory.
To check if the web application is running, run:
curl -XGET 'http://localhost:8080/vipra/rest/info'
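Before running the build, it can help to confirm that the tools installed in the steps above are actually on the PATH. The following is a minimal sketch; the function name missing_tools is hypothetical, and the command names (java, mongo, git, mvn, node, npm, bower, gulp, curl) are assumed from the packages installed above.

```shell
#!/bin/sh
# Hypothetical helper: print every command from the argument list
# that cannot be found on the PATH. Prints nothing if all are present.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Command names assumed from the packages installed in the steps above
missing_tools java mongo git mvn node npm bower gulp curl
```

An empty output means every listed command was found; any name that is printed points at a step that still needs to be completed.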
Troubleshooting
-
On running maven:
Failed to read artifact descriptor for org.apache.maven.plugins:maven-resources-plugin:jar:2.3
sudo /var/lib/dpkg/info/ca-certificates-java.postinst configure
-
On running
npm install
sudo chown -R vipra:vipra ~/.npm
-
On running
bower install
with the error message /usr/bin/env: node: No such file or directory
sudo ln -s /usr/bin/nodejs /usr/bin/node
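Running the symlink command a second time fails because the link already exists. A guarded variant could look like the following sketch; the function name ensure_node_link is hypothetical, and the directory is a parameter so that the behaviour can be tried out in a scratch directory first.

```shell
#!/bin/sh
# Hypothetical helper: create DIR/node as a symlink to DIR/nodejs, but only
# if node is missing and nodejs exists. DIR defaults to /usr/bin.
ensure_node_link() {
    dir=${1:-/usr/bin}
    if [ -e "$dir/node" ]; then
        echo "node already present in $dir"
    elif [ -x "$dir/nodejs" ]; then
        ln -s "$dir/nodejs" "$dir/node" && echo "linked $dir/node"
    else
        echo "no nodejs executable in $dir" >&2
        return 1
    fi
}

# Demonstration in a temporary directory with a stand-in nodejs executable
demo=$(mktemp -d)
printf '#!/bin/sh\n' > "$demo/nodejs" && chmod +x "$demo/nodejs"
ensure_node_link "$demo"
ensure_node_link "$demo"
rm -rf "$demo"
```

To apply it to the real /usr/bin, the function has to run with root privileges, just like the plain ln -s command above.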