Commit 34a31d5d by Adi Amir

Merge branch 'master' of https://municipalitybank.com/mcx/devops

parents 6e80cf00 1d3779a7
......@@ -218,6 +218,8 @@ es2csv -r -q '{ "query":{"bool":{"must":[{"query_string":{"query":"tenant:fremon
es2csv -r -q '{ "query": { "bool": { "must": [ { "query_string":{ "query":"metaData.event:TrafficJam" } } ], "filter" : { "geo_bounding_box" : { "metaData.loc" :{ "top_left" : { "lat" : 37.460391, "lon" : -122.167106 }, "bottom_right" : { "lat" : 37.152796, "lon" : -121.538847 } } }}}}}' -u 167.99.206.187:9200 -o activities-sanjose-traffic-reports.csv -m 5000000 -i activityidx -f creationTime id type metaData
es2csv -r -q '{ "query":{"bool":{"must":[{"query_string":{"query":"tenant:fremont & type:\"report/*\"" }}],"filter":[{"range": {"published": {"gte": 1576328644000,"format": "epoch_millis" }}},{"geo_bounding_box":{"metaData.loc":{"top_left":{"lat" : 37.629134,"lon" : -122.173405 }, "bottom_right" : { "lat" : 37.457418, "lon" : -121.835116 }}}}]}}}' -u 172.16.1.80:9200 -o reports-fremont.csv -m 100 -i activityidx -f creationTime id type metaData
// CURATOR
command line app for managing elasticsearch indices.
Install:
......
INSTALL ON 16.04
- pre installation:
sudo apt-get install -y software-properties-common python-software-properties
Install the gcc-7 packages:
sudo apt-get install -y software-properties-common
sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt update
sudo apt install g++-7 -y
Set it up so the symbolic links gcc, g++ point to the newer version:
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 60 \
--slave /usr/bin/g++ g++ /usr/bin/g++-7
sudo update-alternatives --config gcc
gcc --version
g++ --version
# Use this one if you want **all** the toolchain programs (with the triplet names) to also point to gcc-7.
# For example, this is needed if building Debian packages.
# If you are already root (e.g. inside a docker image), remove the "sudo" below.
ls -la /usr/bin/ | grep -oP "[\S]*(gcc|g\+\+)(-[a-z]+)*[\s]" | xargs sudo bash -c 'for link in ${@:1}; do ln -s -f "/u
PULSAR - FLINK ADMIN:
PULSAR:
This is a pub/sub msg queue (bus) where domains and stream processing jobs communicate via topics
Topic structure (url) is:
[durability://][tenant]/[namespace]/[topic]
e.g: persistent://mcx/activities/activity
Domains topics:
- all our topics are under 'mcx' tenant
- every domain has its own namespace according to its service name (default behavior)
e.g. the activities domain namespace is 'activities'
- the topics are the APIs, e.g. the /activity api will have the topic url: 'persistent://mcx/activities/activity'
- every domain creates its own topics on startup, except activities, which also creates the 'public' topics
Streaming jobs:
- all the streaming jobs have topics under 'public' namespace in the 'mcx' tenant
e.g. the cep-activities job subscribes to the topics: 'persistent://mcx/public/activities' and 'persistent://mcx/public/activity'
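The topic naming rules above can be sketched as a small helper. This is only an illustration of the conventions in these notes: the function names are hypothetical, and rejecting nested api paths is an assumption, since the notes do not say how such paths map to topics.

```python
def topic_url(namespace: str, topic: str, tenant: str = "mcx",
              durability: str = "persistent") -> str:
    """Build a url of the form durability://tenant/namespace/topic."""
    parts = (tenant, namespace, topic)
    # Assumption: each part is a single non-empty path segment.
    if not all(parts) or any("/" in part for part in parts):
        raise ValueError(f"invalid topic parts: {parts!r}")
    return f"{durability}://{tenant}/{namespace}/{topic}"


def domain_topic(service_name: str, api_path: str) -> str:
    """Topic for a domain api: the namespace is the service name."""
    return topic_url(service_name, api_path.strip("/"))


def public_topic(topic: str) -> str:
    """Topic used by the streaming jobs, under the 'public' namespace."""
    return topic_url("public", topic)


if __name__ == "__main__":
    print(domain_topic("activities", "/activity"))
    print(public_topic("activities"))
```

Run as a script, this prints the domain topic and one of the public topics quoted in the examples above.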
Docker:
- in the compose file under pulsar service
- there are 3 ports to open/map:
- 80 for web interface ( only for pulsar-standalone or pulsar-dashboard)
- 8080 for admin
- 6650 the actual pubsub port
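As a rough sketch, the three ports translate into the following endpoints. The helper name and the port_map parameter are hypothetical; the pulsar:// scheme is the standard Pulsar client url scheme and is not taken from these notes.

```python
# Container-side ports as listed above.
PULSAR_PORTS = {"web": 80, "admin": 8080, "pubsub": 6650}


def pulsar_endpoints(host: str, port_map: dict = None) -> dict:
    """Return the endpoint urls for a pulsar host.

    port_map optionally overrides the host-side port per role,
    e.g. {"web": 8082} when the compose file maps 8082:80.
    """
    ports = dict(PULSAR_PORTS)
    ports.update(port_map or {})
    return {
        "web": f"http://{host}:{ports['web']}",
        "admin": f"http://{host}:{ports['admin']}",
        "pubsub": f"pulsar://{host}:{ports['pubsub']}",
    }


if __name__ == "__main__":
    for role, url in pulsar_endpoints("localhost", {"web": 8082}).items():
        print(role, url)
```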
Web Interface:
- There is an internal web interface if you run the pulsar-standalone docker, but it does not
show the actual topics in real time
- pulsar-express:
https://github.com/bbonnin/pulsar-express
This is a remote interface that connects to pulsar (like the db tools for mongo/redis etc.)
you can install it using:
npm install pulsar-express -g (add sudo if you're not on root)
when creating connections use the admin port mapping
FLINK:
This is a stream processing framework. It receives data from 'sources' (subscribes to topics),
processes the msgs in real time, and outputs (if there is output) to 'sinks' (like http or publishing to topics)
Every processing unit is called a JOB; each job has several tasks that run on a cluster of nodes.
- the flink framework has a job manager and a task manager that has one or more nodes
- the jobs are loaded to flink from the web interface/command line or from a pre-loaded directory
- our jobs are published to the archiva so that they can be downloaded anywhere and uploaded to
the flink framework, the path is http://municipalitybank.com:8081/repository/internal/com/ipgallery/esp/[job-name]/[version]/[job-artifact].jar
e.g: http://municipalitybank.com:8081/repository/internal/com/ipgallery/esp/activities-cep-job/1.0.0/activities-cep-job-1.0.0-all.jar
- the jobs don't create topics on pulsar, so the activities service must be up before uploading the job, and the pulsar
service must be up before starting the activities service
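The archiva path pattern and the startup order above can be sketched as follows. The helper names are hypothetical, and the "-all" suffix on the jar name is generalised from the single example above, so treat it as an assumption for other jobs.

```python
ARCHIVA_BASE = ("http://municipalitybank.com:8081/repository/internal"
                "/com/ipgallery/esp")

# Order taken from the notes: pulsar, then activities, then the job upload.
STARTUP_ORDER = ("pulsar", "activities", "job-upload")


def job_jar_url(job_name: str, version: str, classifier: str = "all") -> str:
    """Build the archiva url [job-name]/[version]/[job-artifact].jar."""
    # Assumption: the artifact is named <job-name>-<version>-<classifier>.jar
    artifact = f"{job_name}-{version}-{classifier}.jar"
    return f"{ARCHIVA_BASE}/{job_name}/{version}/{artifact}"


def missing_prerequisites(step: str, already_up: set) -> list:
    """List the earlier steps that are not up yet, in startup order."""
    index = STARTUP_ORDER.index(step)
    return [s for s in STARTUP_ORDER[:index] if s not in already_up]


if __name__ == "__main__":
    print(job_jar_url("activities-cep-job", "1.0.0"))
    print(missing_prerequisites("job-upload", {"pulsar"}))
```

An empty list from missing_prerequisites means the step can go ahead.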
Docker:
- there are two docker services, one for job manager and one for task manager
- port mapping 8081 for web interface
- if the flink is running on a different network/machine then we need to define the domains needed by the jobs
in the extra_hosts
- since the jobs run under the flink instances, the job logs are sent to the console log, which will appear in the instances' docker log.
Web Interface:
- in port 8081, under 'Submit new Job' upload the job, click on it and click 'Submit'
- it may fail on first try due to timeouts, try again before searching for errors in the logs
FROM drissamri/java:jre8
#FROM drissamri/java:jre8
#FROM registry.ng.bluemix.net/ibmliberty
FROM openjdk:8-jdk-slim
RUN mkdir -p /logs/conf
......
FROM flink:1.7
#FROM flink:1.7
FROM flink:1.9
ADD bcprov-jdk16-1.45.jar /opt/flink/lib/
......@@ -2,7 +2,7 @@ version: "2"
services:
jobmanager:
# image: flink:1.7
image: municipalitybank.com:5050/mcx/devops/flink-pulsar
image: municipalitybank.com:5050/mcx/devops/flink-pulsar:1.9
expose:
- "6123"
ports:
......@@ -10,12 +10,17 @@ services:
command: jobmanager
environment:
- JOB_MANAGER_RPC_ADDRESS=jobmanager
extra_hosts:
- "alerts:172.16.1.244"
- "scp:172.16.1.72"
# volumes:
# - "/opt/flink/conf/flink-conf.yaml:/home/amir/git/devops/docker/composers/flink-conf.yaml"
networks:
- backend
taskmanager:
# image: flink:1.7
image: municipalitybank.com:5050/mcx/devops/flink-pulsar
image: municipalitybank.com:5050/mcx/devops/flink-pulsar:1.9
expose:
- "6121"
- "6122"
......@@ -26,17 +31,22 @@ services:
- "jobmanager:jobmanager"
environment:
- JOB_MANAGER_RPC_ADDRESS=jobmanager
extra_hosts:
- "alerts:172.16.1.244"
- "scp:172.16.1.72"
# volumes:
# - "/opt/flink/conf/flink-conf.yaml:/home/amir/git/devops/docker/composers/flink-conf.yaml"
networks:
- backend
pulsar:
image: apachepulsar/pulsar-standalone
image: apachepulsar/pulsar-standalone:2.4.2
ports:
- 8080:8080
- 8082:80
- 6650:6650
volumes:
- "/ext/pulsar/data:/pulsar/data"
# volumes:
# - "/ext/pulsar/data:/pulsar/data"
networks:
- backend
networks:
......
# FROM gcc:4.9
# LABEL maintainer="amir.aharon@ipgallery.com"
# WORKDIR /project
# RUN echo "deb http://ftp.debian.org/debian jessie-backports main" > /etc/apt/sources.list.d/jessie-backports.list \
# && apt-get update && apt-get -t jessie-backports install -y --no-install-recommends \
# gdb cmake \
# && apt-get clean \
# && rm -rf /var/lib/apt/lists/* /etc/apt/sources.list.d/jessie-backports.list
FROM ubuntu:xenial
LABEL maintainer="amir.aharon@ipgallery.com"
VOLUME "/project"
WORKDIR "/project"
RUN apt-get update
RUN apt-get install -y software-properties-common python-software-properties
RUN add-apt-repository ppa:ubuntu-toolchain-r/test
RUN apt-get update
# RUN apt-get dist-upgrade -y
RUN apt-get install gcc-7 g++-7 make gdb gdbserver wget -y && \
apt-get clean autoclean && \
apt-get autoremove -y && \
rm -rf /var/lib/apt/lists/*
# wget -O /tmp/conan.deb -L https://github.com/conan-io/conan/releases/download/0.25.1/conan-ubuntu-64_0_25_1.deb && \
# dpkg -i /tmp/conan.deb
RUN cd /opt \
&& wget https://cmake.org/files/v3.10/cmake-3.10.1-Linux-x86_64.tar.gz \
&& tar xf cmake-3.10.1-Linux-x86_64.tar.gz \
&& rm cmake-3.10.1-Linux-x86_64.tar.gz \
&& ln -sf /opt/cmake-3.10.1-Linux-x86_64/bin/cmake /usr/bin/cmake
RUN update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-7 999 \
&& update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-7 999 \
&& update-alternatives --install /usr/bin/cc cc /usr/bin/gcc-7 999 \
&& update-alternatives --install /usr/bin/c++ c++ /usr/bin/g++-7 999
ADD ./scripts/cmake-build.sh /build.sh
RUN chmod +x /build.sh
RUN useradd -ms /bin/bash develop
RUN echo "develop ALL=(ALL:ALL) ALL" >> /etc/sudoers
EXPOSE 2000
USER develop
VOLUME "/home/develop/project"
WORKDIR "/home/develop/project"
#ENTRYPOINT exec /build.sh
\ No newline at end of file