Category: Blog

  • rabbitdisease

    rabbit disease model

    rabbit

    Previous exposure to myxomatosis reduces survival of European rabbits during outbreaks of rabbit haemorrhagic disease

    Accompanies paper:

    Barnett, LK, TAA Prowse, DE Peacock, GJ Mutze, R Sinclair, J Kovaliski, B Cooke, CJA Bradshaw. 2018. Previous exposure to myxoma virus reduces survival of European rabbits during outbreaks of rabbit haemorrhagic disease. Journal of Applied Ecology 55: 2954-2962

    • Barnett, Louise K., Flinders University
    • Prowse, Thomas A. A., University of Adelaide
    • Peacock, David E., Biosecurity South Australia, Department of Primary Industries and Regions, Adelaide, South Australia, Australia
    • Mutze, Gregory J., Biosecurity South Australia, Department of Primary Industries and Regions, Adelaide, South Australia, Australia
    • Sinclair, Ron G., University of Adelaide
    • Kovaliski, John, Biosecurity South Australia, Department of Primary Industries and Regions, Adelaide, South Australia, Australia
    • Cooke, Brian D., University of Canberra
    • Bradshaw, Corey J. A., Flinders University

    Publication date: 18 May 2019

    Publisher: Dryad doi:10.5061/dryad.j91d66c

    Data/code citation

    Barnett, Louise K. et al. (2019), Data from: Previous exposure to myxomatosis reduces survival of European rabbits during outbreaks of rabbit haemorrhagic disease, Dryad, Dataset, doi:10.5061/dryad.j91d66c

    Abstract

    1. Exploiting disease and parasite synergies could increase the efficacy of biological control of invasive species. In Australia, two viruses were introduced to control European rabbits Oryctolagus cuniculus — myxoma virus in 1950, and rabbit haemorrhagic disease virus in 1995. While these biological controls caused initial declines of > 95% in affected populations, today rabbits remain a problem in many areas, despite recurring outbreaks of both diseases.
    2. We used eighteen years of capture-mark-recapture, dead recovery, and antibody assay data from a sentinel population in South Australia to test whether these two diseases interact to modify the survival of individual wild rabbits. We compared four joint, multi-state, dead-recovery models to test the hypotheses that rabbit haemorrhagic disease and myxoma viruses have synergistic (i.e., previous exposure to one virus affects survival during outbreaks of the other virus) or additive effects (i.e., previous exposure to one virus does not affect survival during outbreaks of the other virus).
    3. Rabbit haemorrhagic disease outbreaks reduced the survival of individuals with no immunity by more than half during the 58-day capture-trip intervals, i.e., from 0.86–0.90 to 0.37–0.48. Myxomatosis outbreaks had a smaller effect, reducing survival to 0.74–0.82; however, myxomatosis outbreaks were more prolonged, spanning more than twice as many trips.
    4. There was considerable information-theoretic support (wAICc = 0.69) for the model in which exposure to myxomatosis affected survival during rabbit haemorrhagic disease outbreaks. Rabbits previously exposed to myxoma virus had lower survival during rabbit haemorrhagic disease outbreaks than rabbits never exposed to either virus. There was negligible support for the model in which previous exposure to rabbit haemorrhagic disease affected survival in myxomatosis outbreaks (wAICc < 0.01).
    5. Synthesis and applications — Our results indicate that biological control agents can have a greater impact than single-pathogen challenge studies might suggest. Introducing additional biological control agents might therefore increase mortality of rabbits beyond the additive effects of individual biological controls. Furthermore, our results show that by understanding and exploiting disease synergies, managers could increase the efficacy of biological controls for other invasive animals.

    Usage Notes

    Rabbit capture histories

    Individual capture histories by immunity state: “0” = not captured; “N” = captured with no immunity; “M” = captured with myxoma virus immunity; “R” = captured with rabbit haemorrhagic disease virus immunity; “B” = captured with immunity to both rabbit haemorrhagic disease virus and myxoma virus. Data collected by Biosecurity South Australia, Department of Primary Industries and Regions. A short sketch for tallying these state codes follows the file listing below.

    • CaptureHist.csv
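
    To get a quick feel for the data, the following is a minimal Python sketch. The analysis code in this repository is written in R; this snippet, the pandas dependency, and the assumption that every column holds state codes are illustrative only.

    import pandas as pd

    # Read the capture histories; cells are expected to hold the state
    # codes described above (0, N, M, R, B).
    hist = pd.read_csv("CaptureHist.csv", dtype=str)

    # Tally how often each immunity state appears across all capture occasions.
    counts = (
        hist.apply(pd.Series.value_counts)
            .sum(axis=1)
            .reindex(["0", "N", "M", "R", "B"], fill_value=0)
    )
    print(counts)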

    Trip Covariates

    Details of each trapping trip. Data collected by Biosecurity South Australia, Department of Primary Industries and Regions. Columns:

    • ‘date’: date of the trip
    • ‘RHDV’ / ‘MV’: whether the trip was classified as an outbreak of rabbit haemorrhagic disease virus or myxoma virus
    • ‘Ints’: interval between one trip and the next, expressed as a ratio of the mean interval between trips (58 days); the last interval is the time between the last trapping trip and the last dead recovery
    • ‘SEffort’: trapping effort
    • ‘KTBA’: number of rabbits known to be alive
    • ‘POPAN’, ‘POPANlcl’, ‘POPANucl’: estimated population size based on a POPAN model, with its lower and upper confidence limits
    • number of days since the first trip

    • TripCovariates.csv

    Rabbit multi-state, dead recovery models

    Multi-state, dead-recovery models for 18 years of data from the Turretfield rabbit population, South Australia. Code by Louise K. Barnett, November 2017. Before running the script, you will need to install Program MARK. Mac and Linux users might find this post helpful.

    • Rabbit_Multistate_DeadRecovery.R

    Plotting output of multi-state, dead-recovery model

    This script is for plotting the output from the multi-state model script. Run the models and save the output first.

    Notes. Immunity state / previous exposure categories:

    • N – immunity to neither virus
    • M – immunity to myxoma virus only
    • R – immunity to rabbit haemorrhagic disease virus (RHDV) only
    • B – immunity to both viruses

    Age groups:

    • Kittens: ≤ 600 g (may have residual maternal immunity to RHD)
    • Adults: > 600 g (unlikely to have residual maternal immunity)

    • Rabbit_Multistate_Plotting_Output.R
    Visit original content creator repository https://github.com/cjabradshaw/rabbitdisease
  • lumen-pubsub

    Visit original content creator repository
    https://github.com/LUSHDigital/lumen-pubsub

  • BitsOJ

    BitsOJ

    Offline Judge for competitive programming contests.

    Setup

    Run this script to bypass the following steps:

    1. sudo chmod +x configure.sh
    2. ./configure.sh

    Or, run these commands manually:

    1. Update the system:

    sudo apt-get update
    sudo apt-get upgrade

    2. Install Erlang

    1. cd ~
    2. wget http://packages.erlang-solutions.com/site/esl/esl-erlang/FLAVOUR_1_general/esl-erlang_20.1-1~ubuntu~xenial_amd64.deb
    3. sudo dpkg -i esl-erlang_20.1-1\~ubuntu\~xenial_amd64.deb
      Check your Erlang installation by running:
    4. erl

    3. Install RabbitMQ

    Add the Apt repository to your Apt source list directory (/etc/apt/sources.list.d):

    1. echo "deb https://dl.bintray.com/rabbitmq/debian xenial main" | sudo tee /etc/apt/sources.list.d/bintray.rabbitmq.list
      Next, add the RabbitMQ public signing key to your trusted key list using apt-key:
    2. wget -O- https://www.rabbitmq.com/rabbitmq-release-signing-key.asc | sudo apt-key add -
    3. sudo apt-get update
    4. sudo apt-get install rabbitmq-server

    4. Start the RabbitMQ server:

    1. sudo systemctl start rabbitmq-server.service
    2. sudo systemctl enable rabbitmq-server.service
      To check the status of the RabbitMQ server:
    3. sudo rabbitmqctl status

    5. Create a new admin account

    Replace user_name and user_password with your own values in the following commands:

    1. sudo rabbitmqctl add_user user_name user_password
    2. sudo rabbitmqctl set_user_tags user_name administrator
    3. sudo rabbitmqctl set_permissions -p / user_name ".*" ".*" ".*"

    6. Enable the RabbitMQ management console

    1. sudo rabbitmq-plugins enable rabbitmq_management
    2. sudo chown -R rabbitmq:rabbitmq /var/lib/rabbitmq/
      Visit http://localhost:15672/ and log in using user_name and user_password.

    7. Install Pika

    sudo pip3 install pika

    8. Install PyQt5

    sudo pip3 install pyqt5

    9. For testing purposes, add the following users in the RabbitMQ management portal:

    Username | Password | Status        | Permissions
    ---------|----------|---------------|------------
    BitsOJ   | root     | administrator | All
    client   | client   | None          | vhost
    judge1   | judge1   | management    | All
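
    The snippet below is a hypothetical smoke test, not part of the repository: it only checks that the BitsOJ administrator account from the table above can reach a local RabbitMQ broker via Pika (the queue name and message body are placeholders).

    import pika

    # Credentials taken from the test-user table above.
    credentials = pika.PlainCredentials("BitsOJ", "root")
    parameters = pika.ConnectionParameters(host="localhost", port=5672, credentials=credentials)

    connection = pika.BlockingConnection(parameters)
    channel = connection.channel()

    # Declare a throwaway queue and publish a test message to the default exchange.
    channel.queue_declare(queue="bitsoj_smoke_test", durable=False)
    channel.basic_publish(exchange="", routing_key="bitsoj_smoke_test", body=b"ping")

    print("Connected to RabbitMQ and published a test message")
    connection.close()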

    And you’re done!!!!

    This is a test version of the BitsOJ system. Many security features are not published in this public repository for obvious reasons.

    Download the executables for the complete software.

    Visit original content creator repository
    https://github.com/peeesspee/BitsOJ

  • sensor_reading

    License: MIT

    Sensor Reading

    Decode and display data received through a serial port from sensors. The data is sent as Protocol Buffers messages framed with the Consistent Overhead Byte Stuffing (COBS) algorithm for transmission over the serial port.

    Build the project

    Requirements

    Python

    • protobuf
    • pyserial
    • cobs

    C++

    • Protocol buffer

    Make

    You can use GNU make with the following rules:

    • [all] : compiles all the protoc files and the sensor_reading binary (python and c++)
    • [pb-python] : generates python protobuf files from sensor.proto
    • [pb-cpp] : generates c++ protobuf files from sensor.proto
    • [init] : installs the python requirements with pip
    • [build] : generates a sensor_reading binary from the c++ code.
    • [debug] : compiles the c++ code with debug flags
    • [clean] : cleans the project

    Usage

    Usage python: ./sensor_reading.py [Serial Port] [baudrate] (timeout)

    Usage c++: ./sensor_reading [Serial Port] [baudrate] (timeout)

    The first two arguments are mandatory. The timeout value is expected as an integer in seconds. It is optional; if it is not specified, the program will block waiting on the serial port.

    Message sent

    The message received can be customized through the sensor.proto file. For now, it is composed of two uint32 fields and two float fields, as follows:

    message SensorReading {
        uint32 id = 1;
        uint32 co2 = 2;
        float temperature = 3;
        float humidity = 4;
    }
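
    As a rough illustration of the decoding path, the sketch below is a guess at how a single COBS-framed SensorReading could be parsed in Python. It assumes make pb-python has generated sensor_pb2.py and that frames are delimited by a 0x00 byte on the wire, as is common with COBS; the actual framing used by the firmware may differ.

    from cobs import cobs   # pip3 install cobs
    import sensor_pb2       # generated by: make pb-python

    def decode_frame(frame: bytes):
        """COBS-decode one zero-delimited frame and parse it as a SensorReading."""
        payload = cobs.decode(frame)            # undo the byte stuffing
        reading = sensor_pb2.SensorReading()
        reading.ParseFromString(payload)        # protobuf deserialization
        return reading

    # Example with pyserial (the port name is a placeholder):
    # import serial
    # with serial.Serial("/dev/pts/5", 115200, timeout=5) as ser:
    #     raw = ser.read_until(b"\x00")[:-1]    # read one frame, drop the delimiter
    #     print(decode_frame(raw))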
    

    Sources

    Python

    The code is in the file sensor_reading.py in the python folder. To execute the script, you must first generate the Python protobuf file with the make pb-python rule.

    C++

    When you run make or make build, the sensor_reading binary is created at the root of the repository. The C++ code is available in the sensor_reading.cc file in the cpp directory.

    Tests

    The program was tested on a Debian system. Two Python scripts are available in the tests directory:

    • sensor_writing.py : Script that encodes some dummy sensor data with protobuf and COBS.
    • serial_simulator.py : Creates a dummy serial port which behaves like a pipe, based on a pseudoterminal on Linux (a minimal sketch of this trick appears below).
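
    The following is a minimal sketch of that pseudoterminal trick (Linux only); the actual implementation in serial_simulator.py may differ.

    import os
    import pty

    # Create a pseudoterminal pair; the slave end looks like a serial device.
    master_fd, slave_fd = pty.openpty()
    print("Hey use this serial port:", os.ttyname(slave_fd))

    # Whatever is written to the master end can be read back from the slave
    # device path printed above, so a reader opened on it behaves like a pipe.
    os.write(master_fd, b"hello\n")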

    Run the tests

    To run the tests, first build the project with make. Then launch the serial_simulator script; it will print the serial port to use.

    For example:

    $ > ./tests/serial_simulator.py
    Hey use this serial port: /dev/pts/5

    Following the example above, launch the sensor_reading program on the serial port /dev/pts/5, then launch the sensor_writing.py script on the same serial port. The sensor writing script takes the same parameters as the sensor reading program.

    Author

    • Loic Banet

    License

    MIT License

    Copyright (c) 2018 Loic Banet

    Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

    The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

    THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

    Visit original content creator repository https://github.com/banetl/sensor_reading
  • charcoal-helper

    charcoal-helper

    Web API squid redirector helper client

    Copyright Unmukti Technology Private Limited, India

    Licensed under GNU General Public License. See LICENCE for more details.

    System Requirements

    The Squid helper is written in Perl and is currently running on the following systems:

    It can run on any POSIX-compliant Unix system which has:

    • Perl >= 5.14.x
      • IO::Socket
      • optionally Cache::Memcached::Fast (for memcached enabled helper)
      • Cache::Memcached on OpenWrt

    Memcached Support

    A local memcached server is suggested on the Squid machine. The Cache::Memcached::Fast module is required to use memcached. Helper files with -memcached in their names use memcached, if available.

    Default time for caching the results is 60 seconds.

    my $CACHE_TIME = 60;

    The result for each request is cached in memcached, and the charcoal server is queried only when the result is not found in the cache.

    To install memcached on your machine, please refer to the documentation provided by your distribution. For Debian/Ubuntu, you may follow these steps:

    apt-get update

    apt-get install memcached

    systemctl enable memcached

    systemctl restart memcached

    Following is a transcript of a successful telnet session to memcached:

    telnet localhost 11211
    Trying 127.0.0.1...
    Connected to localhost.
    Escape character is '^]'.
    stats
    STAT pid 1108
    STAT uptime 11402
    STAT time 1503303182
    STAT version 1.4.25 Ubuntu
    STAT libevent 2.0.21-stable
    STAT pointer_size 64
    STAT rusage_user 0.144000
    STAT rusage_system 0.176000
    STAT curr_connections 5
    STAT total_connections 7
    ...
    ...
    ...
    END
    quit
    Connection closed by foreign host.
    

    Squid Versions supported

    • Squid 2.x is supported in compatibility mode with the -c argument to the helper.
    • Squid 3.x and later are supported natively as an external ACL helper.

    Setup and Configuration

    Add the following lines to squid.conf:

    Configuration as External ACL Helper:

    http_access deny !safe_ports
    http_access deny connect !ssl_ports
    
    external_acl_type charcoal_helper ttl=60 negative_ttl=60 children-max=X children-startup=Y children-idle=Z concurrency=10 %URI %SRC %IDENT %METHOD %% %MYADDR %MYPORT /etc/config/squid-helpers/charcoal-helper-ext-memcached.pl <API_KEY>
    acl charcoal external charcoal_helper
    http_access deny !charcoal
    
    http_access allow localhost manager
    http_access deny manager
    

    Configuration as a URL Rewrite Program (not recommended):

    url_rewrite_program /path/to/charcoal-helper.pl YOUR_API_KEY
    url_rewrite_children X startup=Y idle=Z concurrency=1
    

    Adjust the values of X, Y and Z for your environment. Typically, X=10, Y=2 and Z=1 work fine on
    ALIX and RouterBoard devices with around 10 machines in the network.

    To obtain an API key, kindly write to charcoal@hopbox.in

    Alternatively, self-host the charcoal server: https://github.com/hopbox/charcoal/

    Managing the ACL rules

    Head to my.charcoal.io and log in with the credentials provided with the API key.

    Visit original content creator repository
    https://github.com/Hopbox/charcoal-helper

  • mri-report-generator

    MRI Report Generator with Google Gemini and PDF Export

    This project is a Streamlit-based web application that generates MRI brain reports using a YOLOv8 model for image segmentation and Google Gemini for AI-powered report generation. The application allows users to upload MRI images, segment them to identify tumors, and generate a detailed PDF report with the segmented image.

    Features

    • Upload MRI Images: Users can upload MRI images with mask overlays for processing.
    • YOLOv8 Segmentation: The application uses a YOLOv8 model to segment and detect regions of interest in the MRI images.
    • Google Gemini Integration: The application integrates with Google Gemini to generate detailed MRI reports based on the detected regions.
    • PDF Report Generation: The application creates a downloadable PDF report that includes the segmented MRI image and the AI-generated report.

    How It Works

    1. Upload MRI Image: Users upload an MRI image (PNG, JPG, or JPEG) with mask overlay.
    2. Image Segmentation: The YOLOv8 model processes the image to segment regions of interest, such as tumors.
    3. AI Report Generation: Google Gemini generates a detailed report based on the segmented image and findings.
    4. Download PDF Report: The application generates a PDF file with the segmented image and AI-generated report, which can be downloaded by the user.
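
    As a rough illustration of step 4, here is a minimal sketch of assembling such a PDF with FPDF; the function name, file paths, and report text are placeholders, not the application's actual code.

    from fpdf import FPDF

    def build_pdf(image_path: str, report_text: str, out_path: str = "mri_report.pdf") -> str:
        pdf = FPDF()
        pdf.add_page()
        pdf.set_font("Arial", "B", 14)
        pdf.cell(0, 10, "MRI Brain Report")          # title line
        pdf.image(image_path, x=10, y=30, w=120)     # segmented MRI image
        pdf.set_y(140)                               # move the cursor below the image
        pdf.set_font("Arial", size=11)
        pdf.multi_cell(0, 6, report_text)            # AI-generated findings
        pdf.output(out_path)
        return out_path

    # build_pdf("segmented_mri.png", "Findings: ...")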

    Installation

    To run this application locally, follow these steps:

    1. Clone the repository:

    git clone https://github.com/abwahab175/mri-report-generator.git
    cd mri-report-generator

    2. Install the required dependencies:

    pip install -r requirements.txt

    3. Run the Streamlit application:

    streamlit run app.py

    Usage

    1. Launch the application by running the above command.
    2. Upload an MRI image with a mask overlay using the file uploader.
    3. Click the “Generate PDF Report” button to process the image and generate the report.
    4. Download the generated PDF report.

    Dependencies

    • Python 3.x
    • Streamlit
    • FPDF
    • PIL (Pillow)
    • YOLOv8
    • Google Gemini API

    Contributing

    Contributions are welcome! If you have suggestions, enhancements, or bug fixes, please follow the steps below:

    1. Fork the project.
    2. Create your feature branch (git checkout -b feature/YourFeature).
    3. Commit your changes (git commit -m 'Add some feature').
    4. Push to the branch (git push origin feature/YourFeature).
    5. Open a pull request.

    License

    Distributed under the MIT License. See LICENSE.txt for more information.

    Contact

    Visit original content creator repository
    https://github.com/abwahab175/mri-report-generator

  • ot1d

    OT1D: Discrete Optimal Transport in 1D by Linear Programming


    The OT1D library offers a simple but efficient implementation of an algorithm to compute the Kantorovich-Wasserstein distance between two empirical measures defined in dimension 1, that is, the support points of the measures are in R. We have designed the algorithm by directly exploiting the Complementary slackness conditions of Linear Programming. The implementation focuses more on efficiency than genericity, and we try to be as efficient as possible in several notable cases. We implemented the core algorithm in standard ANSI C++11, and we provide a python3 wrapper, which can be installed with:

    pip3 install ot1d

    The OT1D library provides an implementation of Optimal Transport in 1D that is faster than:

    1. Scipy: it is at least 6x faster than scipy.stats.wasserstein_distance, but it can be up to 11x faster
    2. POT: it is at least 2x faster than ot.lp.wasserstein_1d, but it can be up to 7x faster

    The real speedup will depend on your computer platform (i.e., number of cores), your OS, and your compiler. To run a performance test on your computer, see below, or run the Python script OT1D_test.py. For some strange reason, the speedups on Mac laptops are larger than on other architectures.

    REMARK: If you find instances where OT1D is slower, please, let us know.

    DotLIB

    This tiny library is part of dotlib, a large project to develop Optimal Transport algorithms based on efficient Linear Programming implementations.

    Basic Usage: Colab Notebook

    The simplest way to test this library is to run the following notebook on Colab:

    Data         | Notebook                     | Link
    -------------|------------------------------|--------------
    [2021/06/21] | Testing and evaluating OT1D  | Open In Colab

    Usage

    The main function of the OT1D library is the following:

    z = OT1D(x, y, mu=None, nu=None, p=2, sorting=True, threads=8, plan=False)

    The parameters of the function are:

    • x: the support points of the first measure
    • y: the support points of the second measure
    • mu: the weights of the first measure. If equal to None, all the samples have the same mass.
    • nu: the weights of the second measure. If equal to None, all the samples have the same mass.
    • p: the order of the Wasserstein distance (p=1 or p=2)
    • sorting: if equal to True, the function sorts the support points given in input
    • threads: number of threads to use by the parallel sorting algorithm
    • plan: if equal to True , the function returns the optimal transportation plan (see example interpolate.py)

    The first four parameters can be given in input as numpy arrays (preferred) or python lists.

    Sorting at the speed of light

    In addition, we expose the following in-place parallel sorting function:

    parasort(x, mu=None, threads=8)

    The parameters of the function are:

    • x: the support points of a given measure
    • mu: the weights of the given measure. If equal to None, only the support points are sorted
    • threads: number of threads to use by the parallel sorting algorithm

    The first two parameters can be given in input as numpy arrays (preferred) or python lists.
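
    A small sketch of combining the two functions, using only the parameters documented above: pre-sort once with parasort, then skip the internal sort in OT1D with sorting=False.

    import numpy as np
    from OT1D import OT1D, parasort

    np.random.seed(42)
    n = 100000
    x = np.random.uniform(0, 1, n)
    y = np.random.uniform(0, 1, n)
    mu = np.full(n, 1.0 / n)   # uniform weights
    nu = np.full(n, 1.0 / n)

    # Sort the support points in place (the weights are permuted consistently).
    parasort(x, mu, threads=8)
    parasort(y, nu, threads=8)

    # Since the input is already sorted, disable the internal sorting step.
    z = OT1D(x, y, mu=mu, nu=nu, p=1, sorting=False, threads=8)
    print('Wasserstein distance of order 1, W1(x,y) =', z)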

    Details

    Given two empirical distributions, the Kantorovich-Wasserstein distance is given by the optimal solution of a linear program, known as the transportation problem. While this is a general linear program, when the costs are defined among points belonging to the real line, the problem can be solved with an algorithm having worst-case time complexity of O(n log n). This can be shown by first writing the dual linear program and then the complementary slackness conditions.

    The key step of the algorithm is the sorting of the two arrays of support points x and y. We sort the arrays by using a customized parallel sorting algorithm implemented in C++, which combines the very fast pdqsort with parasort. See the linked webpages for the license type of these two libraries.
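
    For intuition only (this is not the library's implementation): in the special case of two samples of equal size with uniform weights, the slackness conditions reduce the optimal plan to matching the i-th smallest point of x with the i-th smallest point of y, so the distance is just a sort followed by a mean.

    import numpy as np

    def wasserstein_1d_equal(x, y, p=2):
        # Sort both samples and pair them in order (the optimal 1D matching).
        xs, ys = np.sort(x), np.sort(y)
        return np.mean(np.abs(xs - ys) ** p) ** (1.0 / p)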

    Prerequisites for compilation

    If you want to compile the source code and the Python wrapper, you only need the following:

    • A C++ compiler compliant with the C++11 standard.
    • cython
    • numpy

    You might need to install the python-dev library, which on Linux can be installed by:

    apt install python3-dev  # Ubuntu

    Installation

    To install OT1D you can run the following command:

    pip3 install ot1d

    Testing

    For testing the library, you can run the following command:

    python3 basic_test.py

    The basic test snippet is the following:

    import numpy as np
    from OT1D import OT1D, parasort
    
    np.random.seed(13)
    
    N = 1000000
    
    # Uniform samples
    x = np.random.uniform(1, 2, N)
    y = np.random.uniform(0, 1, N)
    
    z = OT1D(x, y, p=2, sorting=True, threads=16)
    
    print('Wasserstein distance of order 2, W2(x,y) =', z)

    and the output should be similar to:

    Wasserstein distance of order 2, W2(x,y) = 1.0002549459050794
    

    Testing for Performance

    These results can be reproduced running the following command (you need to have installed scipy and pot):

    python3 OT1D_test.py

    whose output should be similar to the following (but it depends on your platform):

    --------------- TEST 3: Unsorted input (average runtime) --------------------
    For OT1D using 8 threads
    
    running test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
    
    Testing W1, samples of deltas, n=m
    Scipy: average time = 0.214 speedup = 11.0
    POT  : average time = 0.122 speedup = 6.3
    OT1D : average time = 0.019 speedup = 1.0
    
    Testing W2, samples of deltas, n=m
    POT  : average time = 0.12 speedup = 6.1
    OT1D : average time = 0.02 speedup = 1.0
    
    Testing W1, samples with weights
    Scipy: average time = 0.225 speedup = 7.7
    POT  : average time = 0.121 speedup = 4.2
    OT1D : average time = 0.029 speedup = 1.0
    
    Testing W2, samples with weights
    POT  : average time = 0.119 speedup = 4.1
    OT1D : average time = 0.029 speedup = 1.0
    
    
    --------------- TEST 4: Sorted input (average runtime) --------------------
    For OT1D using 8 threads
    
    running test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
    
    Parallel sorting: time = 0.023 sec
    
    Testing W1, samples of deltas, n=m
    Scipy: average time = 0.07 speedup = 11.4
    POT  : average time = 0.043 speedup = 7.1
    OT1D : average time = 0.006 speedup = 1.0
    
    Testing W2, samples of deltas, n=m
    POT  : average time = 0.042 speedup = 7.0
    OT1D : average time = 0.006 speedup = 1.0
    
    Testing W1, samples with weights
    Scipy: average time = 0.078 speedup = 5.9
    POT  : average time = 0.042 speedup = 3.1
    OT1D : average time = 0.013 speedup = 1.0
    
    Testing W2, samples with weights
    POT  : average time = 0.039 speedup = 3.0
    OT1D : average time = 0.013 speedup = 1.0
    

    Please, contact us by email if you encounter any issues.

    Author and maintainer

    Stefano Gualandi, stefano.gualandi@gmail.com.

    Maintainer: Stefano Gualandi stefano.gualandi@gmail.com

    Visit original content creator repository https://github.com/stegua/ot1d
  • drill-helm-charts

    Helm Charts for Apache Drill

    Overview

    This repository contains a collection of files that can be used to deploy Apache Drill on Kubernetes using Helm Charts. Supports single-node and cluster modes.

    What are Helm and Charts?

    Helm is a package manager for Kubernetes. Charts are a packaging format in Helm that can simplify deploying Kubernetes applications such as Drill Clusters.

    Pre-requisites

    • A Kubernetes Cluster (this project is tested on GKE clusters)
    • Helm version 3 or greater
    • Kubectl version 1.16.0 or greater

    Chart Structure

    Drill Helm charts are organized as a collection of files inside of the drill directory. As Drill depends on Zookeeper for cluster co-ordination, a zookeeper chart is inside the dependencies directory. The Zookeeper chart follows a similar structure as the Drill chart.

    drill/   
      Chart.yaml    # A YAML file with information about the chart
      values.yaml   # The default configuration values for this chart
      charts/       # A directory containing the ZK charts
      templates/    # A directory of templates, when combined with values, will generate valid Kubernetes manifest files
    

    Templates

    Helm Charts contain templates which are used to generate Kubernetes manifest files. These are YAML-formatted resource descriptions that Kubernetes can understand. These templates contain ‘variables’, values for which are picked up from the values.yaml file.

    Drill Helm Charts contain the following templates:

    drill/
      ...
      templates/
        drill-rbac-*.yaml       # To enable RBAC for the Drill app
        drill-service.yaml      # To create a Drill Service
        drill-web-service.yaml  # To expose Drill's Web UI externally using a LoadBalancer. Works on cloud deployments only. 
        drill-statefulset.yaml  # To create a Drill cluster
      charts/
        zookeeper/
          ...
          templates/
            zk-rbac.yaml        # To enable RBAC for the ZK app
            zk-service.yaml     # To create a ZK Service
            zk-statefulset.yaml # To create a ZK cluster. Currently only a single-node ZK (1 replica) is supported
    

    Values

    Helm Charts use values.yaml to provide default values for the ‘variables’ used in the chart templates. These values, such as the namespace, the number of drillbits, and more, may be overridden either by editing the values.yaml file or during helm install.

    Please refer to the values.yaml file for details on default values for Drill Helm Charts.

    Usage

    Install

    Simple Deploy

    Drill Helm Charts can be deployed as simply as follows:

    # helm install <UNIQUE_NAME> drill/
    helm install drill1 drill/
    

    Override Drill Config

    Overriding the following two Drill configuration files is currently supported:

    • drill/conf/drill-env.sh
    • drill/conf/drill-override.conf

    Please edit or replace them as needed, but do NOT rename or delete them.

    Once the above configuration files are ready, please create the drill-config-cm configMap to upload them to Kubernetes. When a Drill chart is deployed, the files contained within this configMap will be downloaded to each container and used by the drill-bit process during start-up.

    ./scripts/createCM.sh
    

    or

    kubectl create configmap drill-config-cm --from-file=./drill/conf/drill-override.conf --from-file=./drill/conf/drill-env.sh
    

    Enable config overriding by editing the drillConf section in drill/values.yaml file.

    Using Namespaces to Deploy Multiple Drill Clusters

    Kubernetes Namespaces can be used when more than one Drill Cluster needs to be created. We use the default namespace by default. To create a namespace, use the following command:

    # kubectl create namespace <NAMESPACE_NAME>
    kubectl create namespace namespace2
    

    This NAMESPACE_NAME needs to be provided in drill/values.yaml. Or can be provided in the helm install command as follows:

    # helm install <HELM_INSTALL_RELEASE_NAME> drill/ --set global.namespace=<NAMESPACE_NAME>
    helm install drill2 drill/ --set global.namespace=namespace2 --set drill.id=drillcluster2
    

    Note that installing the Drill Helm Chart also installs the dependent Zookeeper chart. So with the current design, each instance of a Drill cluster includes a single-node Zookeeper.

    List Pods

    $ kubectl get pods
    NAME                       READY   STATUS    RESTARTS   AGE
    drillcluster1-drillbit-0   1/1     Running   0          51s
    drillcluster1-drillbit-1   1/1     Running   0          51s
    zk-0                       1/1     Running   0          51s
    
    $ kubectl get pods -n namespace2
    NAME                       READY   STATUS    RESTARTS   AGE
    drillcluster2-drillbit-0   1/1     Running   0          47s
    drillcluster2-drillbit-1   1/1     Running   0          47s
    zk-0                       1/1     Running   0          47s
    

    List Services

    $ kubectl get services
    NAME                    TYPE           CLUSTER-IP      EXTERNAL-IP       PORT(S)                                  AGE
    drill-service           ClusterIP      10.15.242.217   <none>            8047/TCP,31010/TCP,31011/TCP,31012/TCP   3m49s
    drillcluster1-web-svc   LoadBalancer   10.15.250.97    34.71.235.149     8047:30019/TCP,31010:32513/TCP           3m49s
    zk-service              ClusterIP      10.15.243.254   <none>            2181/TCP,2888/TCP,3888/TCP               3m49s
    
    $ kubectl get services -n namespace2
    NAME                    TYPE           CLUSTER-IP      EXTERNAL-IP       PORT(S)                                  AGE
    drill-service           ClusterIP      10.15.246.116   <none>            8047/TCP,31010/TCP,31011/TCP,31012/TCP   2m9s
    drillcluster2-web-svc   LoadBalancer   10.15.249.214   130.211.220.239   8047:30019/TCP,31010:32513/TCP           2m9s
    zk-service              ClusterIP      10.15.246.218   <none>            2181/TCP,2888/TCP,3888/TCP               2m9s
    

    Access Drill Web UI

    For cloud-based deployments, we create a LoadBalancer-type service with an EXTERNAL_IP address. Use this along with the HTTP port to access the Drill Web UI in a browser. Note that the URL acts like a proxy that internally redirects to the Drill Web UI of one of the Drill pods.

    # http://EXTERNAL_IP:PORT
    http://130.211.220.239:8047
    

    Drill Web UI via LoadBalancer for GKE

    Upgrading Drill Charts

    Currently only scaling up/down the number of Drill pods is supported as part of Helm Chart upgrades. To resize a Drill Cluster, edit the drill/values.yaml file and apply the changes as below:

    # helm upgrade <HELM_INSTALL_RELEASE_NAME> drill/
    helm upgrade drill1 drill/
    

    Alternatively, provide the count as a part of the upgrade command:

    # helm upgrade <HELM_INSTALL_RELEASE_NAME> drill/ --set drill.count=2
    helm upgrade drill1 drill/ --set drill.count=2
    

    If autoscaling is enabled,

    # helm upgrade <HELM_INSTALL_RELEASE_NAME> drill/ --set drill.count=<NEW_MIN_COUNT> --set drill.autoscale.maxCount=<NEW_MAX_COUNT>
    helm upgrade drill1 drill/ --set drill.count=3 --set drill.autoscale.maxCount=6
    

    Autoscaling Drill Clusters

    The size of the Drill cluster (number of Drill Pod replicas / number of drill-bits) can not only be manually scaled up or down as shown above, but can also be autoscaled to simplify cluster management. When enabled, with a higher CPU utilization, more drill-bits are added automatically and as the cluster load goes down, so do the number of drill-bits in the Drill Cluster. The drill-bits deemed excessive gracefully shut down, by going into quiescent mode to permit running queries to complete.

    Enable autoscaling by editing the autoscale section in drill/values.yaml file.

    Package

    Drill Helm Charts can be packaged for distribution as follows:

    $ helm package drill/
    Successfully packaged chart and saved it to: /Users/agirish/Projects/drill-helm-charts/drill-1.0.0.tgz
    

    Uninstall

    Drill Helm Charts can be uninstalled as follows:

    # helm [uninstall|delete] <HELM_INSTALL_RELEASE_NAME>
    helm delete drill1
    helm delete drill2
    

    Note that LoadBalancer and a few other Kubernetes resources may take a while to terminate. Before re-installing Drill Helm Charts, please make sure to wait until all objects from any previous installation (in the same namespace) have terminated.

    Visit original content creator repository https://github.com/agirish/drill-helm-charts
  • mqtt-ja

    MQTT Version 3.1.1

    mqtt.org

    Introduction

    Data representations

    Bits

    Integer data values

    • 16bits
    • big-endian

    UTF-8 encoded String

    • The first 2 bytes represent the length of the string in bytes, so the maximum length is 65535 bytes.
    • The following code points must not be included; a server or client that receives them closes the connection:
      • U+0000, and U+D800 to U+DFFF
    • The following code points should not be included; a server or client may close the connection:
      • U+0001 to U+001F
      • U+007F to U+009F
    • The byte sequence 0xEF 0xBB 0xBF is interpreted as U+FEFF and must not be skipped.
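
    For illustration (not part of the specification text), an MQTT UTF-8 encoded string is just a big-endian 2-byte length prefix followed by the UTF-8 bytes; a Python sketch:

    def encode_mqtt_string(s: str) -> bytes:
        data = s.encode("utf-8")
        if len(data) > 65535:
            raise ValueError("MQTT strings are limited to 65535 bytes")
        return len(data).to_bytes(2, "big") + data   # 2-byte big-endian length prefix

    assert encode_mqtt_string("MQTT") == b"\x00\x04MQTT"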

    Packet format

    Structure of an MQTT packet

    • Fixed header: present in every packet.
    • Variable header: present in some packets.
    • Payload: present in some packets.

    Fixed header

    • Byte 1: packet type (4 bits) + flags (4 bits)
    • Byte 2 onward: Remaining Length

    Packet type

    • Position: byte 1, bits 7-4

    Flags

    • Position: byte 1, bits 3-0
    • Vary with the MQTT packet type.
    • Reserved flag values are also explicitly defined.
    • If invalid flags are received, the server or client closes the connection.
    • Flags:
      • DUP: duplicate delivery flag of a PUBLISH packet
      • QoS: PUBLISH Quality of Service
      • RETAIN: PUBLISH retain flag

    Remaining Length

    • Position: starts at byte 2
    • The Remaining Length is the number of bytes remaining in the packet (variable header plus payload).
    • It does not include the bytes used to encode the Remaining Length field itself.
    • The Remaining Length is encoded using a variable-length encoding scheme.
    • Values up to 127 are encoded in a single byte.
    • The lower 7 bits of each byte encode the data; the most significant bit is a continuation bit indicating that the next byte is also part of the length.
    • At most 4 bytes are used to encode the Remaining Length.
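
    For illustration (not part of the specification text), a Python sketch of this variable-length encoding (7 data bits per byte, the most significant bit as the continuation bit, at most 4 bytes):

    def encode_remaining_length(n: int) -> bytes:
        if not 0 <= n <= 268435455:          # largest value representable in 4 bytes
            raise ValueError("Remaining Length out of range")
        out = bytearray()
        while True:
            n, digit = divmod(n, 128)
            out.append((digit | 0x80) if n > 0 else digit)   # set the continuation bit if more bytes follow
            if n == 0:
                return bytes(out)

    def decode_remaining_length(data: bytes) -> int:
        value, multiplier = 0, 1
        for i, byte in enumerate(data):
            value += (byte & 0x7F) * multiplier
            if not byte & 0x80:              # continuation bit clear: this was the last byte
                return value
            multiplier *= 128
            if i >= 3:
                raise ValueError("malformed Remaining Length (more than 4 bytes)")
        raise ValueError("truncated Remaining Length")

    assert encode_remaining_length(321) == b"\xc1\x02"
    assert decode_remaining_length(b"\xc1\x02") == 321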

    Variable header

    Packet Identifier

    Payload

    MQTT packets

    CONNECT

    Fixed

    Variable

    Keep Alive

    • Keep Alive is a time interval measured in seconds.
      • It is represented as a 16-bit value.
      • It is the maximum time interval permitted between the end of transmission of one control packet by the client and the start of transmission of the next.
      • The client must ensure that the interval between the control packets it sends does not exceed the Keep Alive value.
      • If the client has no other control packet to send, it must send a PINGREQ packet.
      • If the client does not receive a PINGRESP within a reasonable time after sending a PINGREQ, it should close the connection to the server.
      • The client may send a PINGREQ at any time, regardless of the Keep Alive value, to check that the network and the server are alive.
      • If the Keep Alive value is non-zero and the server does not receive a control packet from the client within 1.5 times the Keep Alive value, it must disconnect the client.
    • A Keep Alive value of 0 turns the keep-alive mechanism off.
      • When it is off, the server is not required to disconnect inactive clients.
    • Regardless of the client's Keep Alive value, the server may disconnect a client it judges to be inactive or unresponsive at any time.
    • The Keep Alive value varies by application; it is typically a few minutes, and the maximum value is 18 hours 12 minutes 15 seconds.

    CONNACK

    PUBLISH

    Fixed

    DUP

    • Position: byte 1, bit 3
    • DUP
      • 0: this is the first attempt by the server or client to send this PUBLISH packet.
      • 1: this PUBLISH packet is a re-delivery of an earlier attempt.
    • When a server or client re-sends a PUBLISH packet, it must set the DUP flag to 1.
    • For QoS 0 messages, the DUP flag must be set to 0.
    • The server does not propagate the DUP flag of an incoming PUBLISH packet when it forwards the message to subscribers.
    • The DUP flag of an outgoing PUBLISH packet is set independently of the DUP flag of the incoming PUBLISH packet.
    • The value of the DUP flag must be determined solely by whether the outgoing PUBLISH packet is a re-transmission.

    QoS

    • Position: byte 1, bits 2-1
    • Indicates the level of assurance for delivery of the message.

    RETAIN

    Variable

    Payload

    PUBACK

    PUBREC

    PUBREL

    PUBCOMP

    SUBSCRIBE

    SUBACK

    UNSUBSCRIBE

    UNSUBACK

    PINGREQ

    PINGRESP

    DISCONNECT

    Operational behavior

    Storing state

    QoS levels and protocol flows

    Message delivery retry

    Message receipt

    Message ordering

    • A client must follow these rules when implementing the protocol flows:
      • When re-sending QoS 1 or QoS 2 PUBLISH packets, it must re-send them in the order in which the original PUBLISH packets were sent.
      • It must send PUBACK packets in the order in which the corresponding QoS 1 PUBLISH packets were received.
      • It must send PUBREC packets in the order in which the corresponding QoS 2 PUBLISH packets were received.
      • It must send PUBREL packets in the order in which the corresponding PUBREC packets were received.
    • By default, the server must preserve message ordering per topic. It may provide an option that does not guarantee ordering.
    • PUBLISH packets must be sent to each subscriber (for the same topic and QoS) in the order in which they were received.

    Topic Names and Topic Filters

    Handling errors

    Visit original content creator repository
    https://github.com/itsubaki/mqtt-ja