Vipra

The Vipra application is a search system based on topic modeling, with a frontend web application, a backend REST service and a maintenance tool for data import and modeling. It attempts to use automatically discovered topic information in document collections to ease collection browsing and organization. The search system relies on ElasticSearch and Apache Lucene.

This application was created by Eike Cochu for his master's thesis in computer science (2015-2016) at the Freie Universität Berlin, Germany.

Components

  • vipra-backend: Backend application that connects to the database, filebase and search engine. README
  • vipra-cmd: Backend utility tool to import and manage backend services. README
  • vipra-ui: Frontend user interface that connects to the backend REST service. README
  • vipra-util: Shared libraries and classes for backend applications and utility tools. README

Installation

  1. If MongoDB or ElasticSearch run on different servers than the JavaEE application server, the configuration files need to be changed. Edit the config.properties file in both archives:

    • in vipra.war: open vipra.war in an archive program, navigate to /WEB-INF/classes and edit config.properties as needed.
    • in vipra-cmd.jar: open vipra-cmd.jar and edit config.properties as needed.
  2. Test the connection by running ./vipra -t
  3. Copy vipra.war to your JavaEE application server
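
The config.properties edits described above can also be scripted. The following is a minimal sketch, assuming config.properties uses plain key=value lines. The helper name set_property and the key name in the usage comments are hypothetical, so check the shipped config.properties for the real keys.

```shell
# set_property FILE KEY VALUE
# Sets KEY to VALUE in a properties file with plain "key=value" lines.
# An existing entry is rewritten in place, a missing one is appended.
set_property() {
    sp_file=$1
    sp_tmp=$(mktemp) || return 1
    awk -v k="$2" -v v="$3" '
        index($0, k "=") == 1 { print k "=" v; found = 1; next }
        { print }
        END { if (!found) print k "=" v }
    ' "$sp_file" > "$sp_tmp" && cat "$sp_tmp" > "$sp_file"
    sp_status=$?
    rm -f "$sp_tmp"
    return $sp_status
}

# Possible usage (key name and host are placeholders, not taken from Vipra):
#   unzip vipra.war WEB-INF/classes/config.properties
#   set_property WEB-INF/classes/config.properties some.host.key db.example.org
#   zip vipra.war WEB-INF/classes/config.properties
```

The unzip and zip calls extract the single file and write it back into the archive, which avoids unpacking the whole WAR file.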

Development

The following steps were carried out in this order on a fully updated Ubuntu 15.10 virtual machine to create a fully operational development environment for the Vipra projects. They are only required for project development. If you want to create a virtual machine for development, give it a virtual hard drive of at least 10 GB.

  1. Install required libraries and tools

    • Java 8
    • MongoDB: the database
    • git: to clone the project repository.
    • maven3: to install project dependencies
    • Tomcat 8: a JavaEE application server
    • gsl, gflags: for building dtm (see below)
    sudo apt-get install openjdk-8-jdk mongodb git maven tomcat8 libgsl0-dev

    enable and test MongoDB:

    sudo systemctl enable mongodb
    sudo systemctl restart mongodb
    mongo

    if running from inside a VM, rebind MongoDB host:

    sudo nano /etc/mongodb.conf
    # comment out line:
    # bind_ip = 127.0.0.1
    # save
    sudo systemctl restart mongodb
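
The manual edit above can also be done non-interactively. This is a minimal sketch, assuming the setting is written as an uncommented line starting with bind_ip, as in the snippet above. The function name is made up for illustration.

```shell
# comment_out_bind_ip FILE
# Prefixes every uncommented "bind_ip" line in FILE with "# ".
# Lines that are already commented out are left untouched,
# so running the function twice is harmless.
comment_out_bind_ip() {
    cb_tmp=$(mktemp) || return 1
    sed 's/^[[:space:]]*bind_ip/# &/' "$1" > "$cb_tmp" && cat "$cb_tmp" > "$1"
    cb_status=$?
    rm -f "$cb_tmp"
    return $cb_status
}

# The configuration file belongs to root, so on the real system the same
# substitution would be run directly with GNU sed:
#   sudo sed -i 's/^[[:space:]]*bind_ip/# &/' /etc/mongodb.conf
```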

    if watchman will be used (see below), additional packages are required for building it from source:

    sudo apt-get install build-essential autoconf automake python-dev
  2. Install a recent version of nodejs (the official repositories contain old versions)

    curl -sL https://deb.nodesource.com/setup_5.x | sudo -E bash -
    sudo apt-get install nodejs
    # gulp needs a node executable, ubuntu only creates a nodejs executable
    sudo ln -s /usr/bin/nodejs /usr/bin/node
  3. Install required node packages for frontend development

    sudo npm install -g bower gulp
  4. Build and install watchman. This step is optional but recommended, because the nodejs file watchers are slower. It is only needed if the gulp watch command will be used to continuously rebuild the frontend project assets.

    git clone https://github.com/facebook/watchman.git
    cd watchman
    ./autogen.sh
    ./configure
    make
    sudo make install

    raise the maximum inotify watchers to allow watching more files for changes

    echo fs.inotify.max_user_watches=524288 | sudo tee -a /etc/sysctl.conf && sudo sysctl -p
  5. Install and enable ElasticSearch

    wget https://download.elasticsearch.org/elasticsearch/release/org/elasticsearch/distribution/deb/elasticsearch/2.2.0/elasticsearch-2.2.0.deb
    sudo dpkg -i elasticsearch-2.2.0.deb
    rm elasticsearch-2.2.0.deb
    sudo systemctl enable elasticsearch
    sudo systemctl restart elasticsearch

    if installed from within a VM, rebind ElasticSearch host:

    echo "network.bind_host: 0" | sudo tee -a /etc/elasticsearch/elasticsearch.yml
    sudo systemctl restart elasticsearch

    and test if it is available:

    curl -XGET 'http://localhost:9200/'
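
ElasticSearch needs a few seconds after a restart before it accepts connections, so the availability check can fail if it runs immediately. A small retry wrapper is one way around that. This is only a sketch; the helper name retry and the attempt counts are chosen for illustration.

```shell
# retry ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND until it succeeds, at most ATTEMPTS times, sleeping DELAY
# seconds between tries. Returns 0 on success, 1 if every attempt failed.
retry() {
    r_left=$1
    r_delay=$2
    shift 2
    while [ "$r_left" -gt 0 ]; do
        "$@" && return 0
        r_left=$((r_left - 1))
        # no need to sleep after the last failed attempt
        [ "$r_left" -gt 0 ] && sleep "$r_delay"
    done
    return 1
}

# Possible usage: poll once per second, for at most 30 attempts
#   retry 30 1 curl -fsS 'http://localhost:9200/' > /dev/null
```

The same wrapper could be used for the web application check once the WAR file has been deployed.
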
  6. Install Eclipse EE. A different Java IDE (or none at all) can be used, but the Vipra projects were created with Eclipse and come with settings and run configurations for it. Install the M2Eclipse plugin for Maven support in Eclipse if it is not yet installed. The plugin downloads the required project dependencies and manages the project configurations, build versioning and packaging. Set up the installed Java EE application server in Eclipse and add the vipra-backend project to it.

  7. Clone the project repository and edit the build.sh script

    git clone https://somerepo.url/vipra.git
    cd vipra
    nano build.sh

    make sure the 'TOMCAT_WEBAPPS' path points to the webapps directory of the Tomcat installation, if you want the WAR file to be deployed automatically.

    to build the projects, run:

    ./build.sh

    if auto deployment is disabled or the path to the webapps directory is wrong, you need to manually copy ./vipra-backend/target/vipra.war to your server's web application directory.

    to check if the web application is running, run:

    curl -XGET 'http://localhost:8080/vipra/rest/info'
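
If automatic deployment is not used, the manual copy can be wrapped in a small function that checks both paths first. This is a sketch only: the function name deploy_war is hypothetical, and /var/lib/tomcat8/webapps is the usual webapps directory of the Ubuntu tomcat8 package, which may differ on your system.

```shell
# deploy_war WAR_FILE WEBAPPS_DIR
# Copies WAR_FILE into WEBAPPS_DIR after checking that both exist.
deploy_war() {
    if [ ! -f "$1" ]; then
        echo "WAR file not found: $1" >&2
        return 1
    fi
    if [ ! -d "$2" ]; then
        echo "webapps directory not found: $2" >&2
        return 1
    fi
    cp "$1" "$2/"
}

# Possible usage (the target directory is an assumption; run it as a user
# that may write to that directory):
#   deploy_war ./vipra-backend/target/vipra.war /var/lib/tomcat8/webapps
```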

Troubleshooting

  • On running Maven with the error message Failed to read artifact descriptor for org.apache.maven.plugins:maven-resources-plugin:jar:2.3

    sudo /var/lib/dpkg/info/ca-certificates-java.postinst configure
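  • On running npm install (this fixes the ownership of the npm cache directory; replace vipra with your own user and group name)

    sudo chown -R vipra:vipra ~/.npm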
  • On running bower install with the error message /usr/bin/env: node: No such file or directory

    sudo ln -s /usr/bin/nodejs /usr/bin/node
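
Before creating the symlink, it can help to confirm that it is actually missing. A minimal sketch, with a made-up function name:

```shell
# node_link_needed
# Succeeds (exit status 0) only if there is no "node" executable on the PATH
# but a "nodejs" executable exists, which is the situation the symlink fixes.
node_link_needed() {
    ! command -v node > /dev/null 2>&1 && command -v nodejs > /dev/null 2>&1
}

# Possible usage:
#   if node_link_needed; then sudo ln -s "$(command -v nodejs)" /usr/bin/node; fi
```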