Vipra

The Vipra application is a topic-modeling-based search system consisting of a frontend web application, a backend REST service and a maintenance tool for data import and modeling. It attempts to leverage automatically discovered topic information in document collections to ease collection browsing and organization. The search system relies on ElasticSearch and Apache Lucene.

This application was created by Eike Cochu for his master's thesis in computer science (2015-2016) at the Freie Universität Berlin, Germany.

Components

  • vipra-backend: Backend application that connects to the database, filebase and search engine. README
  • vipra-cmd: Backend utility tool to import and manage backend services. README
  • vipra-ui: Frontend user interface that connects to the backend REST service. README
  • vipra-util: Shared libraries and classes for backend applications and utility tools. README

Installation

Manual

  1. Deploy vipra.war or exploded vipra directory to a Java application server. Deploy to ROOT (/) or a different context path (/vipra, ...)
  2. Edit WEB-INF/classes/config.json if MongoDB or ElasticSearch do not run with default configuration (host/port)
  3. Deploy vipra-ui to a static webserver
  4. Create rewrite rules to rewrite all non-matching files to /index.html
  5. If not deployed in root, change /index.html tag to context path
  6. Change /js/config.js if the backend was not deployed on the same server, not in root context or if the server has a different port than 8080

If everything is left at its defaults, the application should be available at http://someserver/ and the backend at http://someserver:8080/rest/
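
The default URLs above can be checked from a shell once both parts are deployed. The following is a minimal sketch, assuming `curl` is installed. The helper names `frontend_url`, `backend_url` and `check_url` are hypothetical and not part of Vipra, and `someserver` is the placeholder host used above.

```shell
#!/bin/sh
# Hypothetical helpers (not part of Vipra) for checking a deployment.

# Build the frontend URL for a host; the static webserver is assumed to use port 80.
frontend_url() {
    printf 'http://%s/\n' "$1"
}

# Build the backend REST URL from host, port (default 8080) and context path.
# An empty context path or "/" means the WAR was deployed to ROOT.
backend_url() {
    host=$1
    port=${2:-8080}
    context=${3:-}
    # Strip one leading and one trailing slash so "/vipra" and "vipra/" behave the same.
    context=${context#/}
    context=${context%/}
    if [ -n "$context" ]; then
        printf 'http://%s:%s/%s/rest/\n' "$host" "$port" "$context"
    else
        printf 'http://%s:%s/rest/\n' "$host" "$port"
    fi
}

# Succeed only if the URL answers with a non-error HTTP status.
check_url() {
    curl --silent --fail --output /dev/null "$1"
}
```

For example, `check_url "$(frontend_url someserver)"` tests the frontend, and `check_url "$(backend_url someserver 8080 /vipra)info"` tests a backend deployed under `/vipra`. The `info` path is the one used in the development check further below; whether `/rest/` itself returns a success status is not stated in this README.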

Docker

  1. Load docker image: docker pull eikecochu/vipra
  2. Create and run docker container: docker run -p 80:80 -p 6789:6789 -p 9300:9300 -p 27017:27017 eikecochu/vipra
  3. Install vipra: ``

Nginx Proxy

To proxy requests through nginx, use this server configuration (replace #SERVER_NAME_HERE#):

server {
    listen 80;
    server_name #SERVER_NAME_HERE#;

    access_log off;
    error_log /var/log/nginx/vipra.error.log warn;

    include include.d/security;
    include include.d/ssl;

    location / {
        proxy_buffering off;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_pass http://localhost:80/;
    }

    location /ws {
        proxy_pass http://localhost:80/ws;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
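
Replacing the placeholder by hand can also be scripted. The following is a minimal sketch, assuming the configuration above has been saved as a template file. The function name `render_vipra_conf` and the file names in the usage note are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: fill in the #SERVER_NAME_HERE# placeholder of the
# nginx configuration shown above and print the result to stdout.
render_vipra_conf() {
    template=$1
    server_name=$2
    # Reject empty names and names with characters outside letters, digits,
    # dots and hyphens, which could break the sed expression or the directive.
    case $server_name in
        ''|*[!A-Za-z0-9.-]*)
            echo "invalid server name: $server_name" >&2
            return 1
            ;;
    esac
    sed "s/#SERVER_NAME_HERE#/$server_name/g" "$template"
}
```

A possible use is `render_vipra_conf vipra.conf.template vipra.example.org | sudo tee /etc/nginx/sites-available/vipra`, followed by `sudo nginx -t` to validate the result before reloading. The target path assumes a Debian or Ubuntu style nginx layout.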

Development

The following steps were carried out in the given order on a fully updated Ubuntu 15.10 virtual machine to create a fully operational development environment for the Vipra projects. They are only required for project development. If you create a virtual machine for development, give it a virtual hard drive of at least 10 GB.

  1. Install required libraries and tools

    • Java 8
    • MongoDB: the database
    • git: to clone the project repository.
    • maven3: to install project dependencies
    • Tomcat 8: a JavaEE application server
    • gsl, gflags: for building dtm (see below)
    sudo apt-get install openjdk-8-jdk mongodb git maven tomcat8 libgsl0-dev

    enable and test MongoDB:

    sudo systemctl enable mongodb
    sudo systemctl restart mongodb
    mongo

    if running from inside a VM, rebind MongoDB host:

    sudo nano /etc/mongodb.conf
    # comment out line:
    # bind_ip = 127.0.0.1
    # save
    sudo systemctl restart mongodb

    if watchman will be used (see below), additional packages are required to build it from source:

    sudo apt-get install build-essential autoconf automake python-dev
  2. Install a recent version of nodejs (the official repositories contain old versions)

    curl -sL https://deb.nodesource.com/setup_5.x | sudo -E bash -
    sudo apt-get install nodejs
    # gulp needs a node executable, ubuntu only creates a nodejs executable
    sudo ln -s /usr/bin/nodejs /usr/bin/node
  3. Install required node packages for frontend development

    sudo npm install -g bower gulp
  4. Build and install watchman. This step is optional but recommended, because the nodejs file watchers are slower. It is only needed if the gulp watch command will be used to continuously rebuild the frontend project assets.

    git clone https://github.com/facebook/watchman.git
    cd watchman
    ./autogen.sh
    ./configure
    make
    sudo make install

    raise the maximum inotify watchers to allow watching more files for changes

    echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf && sudo sysctl -p
  5. Install and enable ElasticSearch

    wget https://download.elasticsearch.org/elasticsearch/release/org/elasticsearch/distribution/deb/elasticsearch/2.2.0/elasticsearch-2.2.0.deb
    sudo dpkg -i elasticsearch-2.2.0.deb
    rm elasticsearch-2.2.0.deb
    sudo systemctl enable elasticsearch
    sudo systemctl restart elasticsearch

    if installed from within a VM, rebind ElasticSearch host:

    # the redirection must run as root, so pipe through sudo tee instead of using sudo echo
    echo "network.bind_host: 0" | sudo tee -a /etc/elasticsearch/elasticsearch.yml
    sudo systemctl restart elasticsearch

    and test if it is available:

    curl -XGET 'http://localhost:9200/'
  6. Install Eclipse EE. A different Java IDE (or none at all) can be used, but the Vipra projects were created with Eclipse and come with settings and run configurations. Install the M2Eclipse plugin for Maven support in Eclipse, if not yet installed. The plugin will download the required project dependencies and manage the project configurations, build versioning and packaging. Set up the installed Java EE application server in Eclipse. Add the vipra-backend project to the web server.

  7. Clone the project repository and edit the build.sh script

    git clone https://somerepo.url/vipra.git
    cd vipra
    nano build.sh

    make sure the 'TOMCAT_WEBAPPS' path points to the webapps directory of the Tomcat installation, if you want the WAR file to be deployed automatically.

    to build the projects, run:

    ./build.sh

    if auto deployment is disabled or the path to the webapps directory is wrong, you need to manually copy ./vipra-backend/target/vipra.war to your server's web application directory.

    to check if the web application is running, run:

    curl -XGET 'http://localhost:8080/vipra/rest/info'
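
After working through the steps above, it can help to confirm that the required tools are on the PATH before building. The following is a minimal sketch, assuming a POSIX shell. The function name `missing_tools` and the variable `DEV_TOOLS` are hypothetical, and the tool list simply mirrors the commands used in the steps above.

```shell
#!/bin/sh
# Hypothetical helper: print every given command name that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Commands used in the steps above (watchman is optional and therefore omitted).
DEV_TOOLS="java mongo git mvn node npm bower gulp curl"
```

Running `missing_tools $DEV_TOOLS` prints one line per missing command and nothing if all are found. The service checks from the steps above (`curl -XGET 'http://localhost:9200/'` for ElasticSearch and `curl -XGET 'http://localhost:8080/vipra/rest/info'` for the backend) then cover the running services.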

Troubleshooting

  • On running maven: Failed to read artifact descriptor for org.apache.maven.plugins:maven-resources-plugin:jar:2.3

    sudo /var/lib/dpkg/info/ca-certificates-java.postinst configure
  • On running npm install

    sudo chown -R vipra:vipra ~/.npm
  • On running bower install with the error message /usr/bin/env: node: No such file or directory

    sudo ln -s /usr/bin/nodejs /usr/bin/node