Category Archives: Miscellaneous

Gradle with JMeter: Integration

Prerequisite:
You should have a sample project that can be built with the gradle command.

A well-known JMeter plugin for Gradle is the one by kulya. Here is the URL: https://github.com/kulya/jmeter-gradle-plugin

Step 1:

Add the snippet below to the build.gradle file in your project:

apply plugin: 'jmeter'

buildscript {
    repositories {
        mavenCentral()
    }
    dependencies {
        classpath "com.github.kulya:jmeter-gradle-plugin:1.3.1-2.6"
    }
}

jmeterRun.configure {
    log.info("Check Point")

    // Script to run; override with -Djmxscripts=<file>
    jmeterTestFiles = [file(System.getProperty('jmxscripts', "Demo.jmx"))]

    // Run-time values with defaults; override with -D<NAME>=<value>
    def TCOUNT = System.getProperty('TCOUNT', "1")
    def LCOUNT = System.getProperty('LCOUNT', "1")
    def TIME = System.getProperty('TIME', "1")

    jmeterPropertyFile = file("src/test/jmeter/jmeter.properties")
    jmeterUserProperties = ["SERVER=" + System.getProperty('SERVER', "qa.bluejeansdev.com"), "TCOUNT=" + TCOUNT, "LCOUNT=" + LCOUNT, "TIME=" + TIME]
}

Below is a screenshot of the snippet above:
Build_Gradle

Step 2:

From the command line, go to your project root folder and run:

gradle jmeterRun

To verify that the plugin is set up correctly, put log.info("Check Point") in your build.gradle file.

Advanced Configuration:

In most cases, you will need to pass user-defined variables to JMeter that are known only at run time.

Let's first load the JMeter script, that is, the .jmx file. For that, use the jmeterTestFiles parameter.

jmeterTestFiles = [file(System.getProperty('jmxscripts',"testJmeter.jmx"))]

Now put your script in your project directory and run gradle jmeterRun; it will load your JMeter script.

Hopefully you are now able to run your first script.

If so, check your build directory: it will contain a jmeter folder and a jmeter_report folder. Explore them. The jmeter folder holds all the property files and logs. The jmeter_report folder holds the final output as XML, plus an HTML report for a user-friendly display. Go impress your boss.

Below is a screenshot of my project directory:
Jmeter_Gradle_ProjectDirectory

The most useful case is passing JMeter user-defined variables through Gradle. The most basic examples are the server IP and the port number.

A JMeter script has to be flexible enough to run in any environment. To achieve that, many people pass user-defined variables on the JMeter command line, like this:

jmeter -JSERVER=www.snippetexample.com -JPORT=8080 -n -t Test2.jmx

The command above runs the .jmx file with the values of SERVER and PORT set.

To achieve the same with Gradle, pass the parameters just as on the JMeter command line, with one small change: use -D instead of -J. On the Eclipse side, in your build.gradle file, you can read each parameter with

System.getProperty('Key', 'default value if not provided')

With this call you can read any parameter that you want to pass to JMeter.

Now, how do you pass these parameters on to JMeter? With this plugin, use the jmeterUserProperties setting. Put as many parameters as you need into a list and assign it to jmeterUserProperties, as shown below:

jmeterUserProperties = ["SERVER="+System.getProperty('SERVER',"www.snippetexample.com"),"PORT="+8080]

To sum up, below is the generalized snippet in a slightly cleaner form:

// Inside the jmeterRun.configure { ... } block
def SERVER = System.getProperty('SERVER', "www.snippetexample.com")
def PORT = System.getProperty('PORT', "8080")
def TCOUNT = System.getProperty('TCOUNT', "1")
def LCOUNT = System.getProperty('LCOUNT', "1")
def TIME = System.getProperty('TIME', "1")

jmeterPropertyFile = file("src/test/jmeter/jmeter.properties")
jmeterUserProperties = ["SERVER=" + SERVER, "PORT=" + PORT, "TCOUNT=" + TCOUNT, "LCOUNT=" + LCOUNT, "TIME=" + TIME]
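
The default-or-override logic of this snippet can be checked outside Gradle. The following is a minimal Python sketch, not part of the plugin: the DEFAULTS table and the function name build_user_properties are hypothetical, the host qa.example.com is made up, and the overrides dict stands in for the -D flags given on the gradle command line.

```python
# Hypothetical stand-in for the System.getProperty(key, default) calls in build.gradle.
DEFAULTS = {
    "SERVER": "www.snippetexample.com",
    "PORT": "8080",
    "TCOUNT": "1",
    "LCOUNT": "1",
    "TIME": "1",
}


def build_user_properties(overrides):
    """Return "KEY=value" strings, using the default when no override is given."""
    return [f"{key}={overrides.get(key, default)}" for key, default in DEFAULTS.items()]


# Similar in spirit to: gradle jmeterRun -DSERVER=qa.example.com -DTCOUNT=5
props = build_user_properties({"SERVER": "qa.example.com", "TCOUNT": "5"})
print(props)
```

Keys that are not overridden keep their defaults, which is what lets the same script run unchanged in any environment.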

Hopefully everything runs smoothly for you. Below is a screenshot of the terminal.

Gradle_CommandLine

Introduction to R and MongoDB

In this article we go through the basics of R and MongoDB.

Introduction to R:

To start with, R is a statistical programming language. It is an interpreted language, so it executes instructions directly from the console rather than compiling them first. Thanks to its built-in and add-on statistical packages, R is very popular among statisticians and data miners. On top of these features, R also provides packages to visualize data in 2D and 3D, which gives a clearer picture of the data and results for better analysis.

Get Started With R

Download URLs:

For R: http://cran.rstudio.com/
R Studio: http://www.rstudio.com/products/rstudio/download/

The very first functions one should know in order to start learning R are:

  • help(functionName)  / ?functionName
    • shows help for the function whose name is provided
  • example(functionName)
    • runs the examples for the function whose name is provided
  • apropos("text")
    • lists all available objects whose names contain the string provided

Let's look at examples of the commands above for clarity.

The commands below can be typed directly into the R console or RStudio.

  • Suppose someone needs help understanding the function min. They can use the help function as shown below.

>  help(min)

This opens the help page for the function min.

  • Suppose they still need to see how the min function is used. They can use the example function as shown below.

> example(min)
min> require(stats); require(graphics)
min> min(5:1, pi) #-> one number
[1] 1
min> pmin(5:1, pi) #-> 5 numbers
[1] 3.141593 3.141593 3.000000 2.000000 1.000000
min> x <- sort(rnorm(100)); cH <- 1.35
min> pmin(cH, quantile(x)) # no names
[1] -2.4030962 -0.4229549 0.1632506 0.8253795 1.3500000

  • Suppose someone cannot find a function for months. They can use the apropos function as shown below.

> apropos("month")
[1] "month.abb" "month.name" "monthplot" "months" "months.Date" "months.POSIXt" "sunspot.month"

Data Types supported by R:

To analyze data of many different kinds, R provides a range of data types that cover most needs.

Below are the main data types widely used for data mining and machine learning.

  • Vectors
  • Matrices
  • Arrays
  • Lists
  • Factors

Visualization Example:

As mentioned above, R provides many packages to visualize data. Take the function persp(), for example: it draws perspective plots of a surface and can present the data in many different forms.

  • persp(volcano, expand = 0.5)

DataDistribution_R_3

 

DataDistribution

DataDistribution_R

DataDistribution_R_1

As discussed above, R provides many data types.
R also provides a structure for keeping data in tabular form: the data frame.

Data Frame:

A data frame is a list of vectors of equal length. Data of different types can be imported into R and stored in a data frame. The source can be csv, xls, table, txt, etc.
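
The "list of vectors of equal length" idea can be modelled in a few lines. The following is a rough Python sketch rather than R code: make_frame and row are hypothetical helper names, and the sample values are invented for illustration.

```python
def make_frame(columns):
    """Build a table from named columns, rejecting columns of unequal length."""
    lengths = {len(values) for values in columns.values()}
    if len(lengths) > 1:
        raise ValueError("all columns must have the same length")
    return {name: list(values) for name, values in columns.items()}


def row(frame, index):
    """Return one row as a dict, reading the same position from every column."""
    return {name: values[index] for name, values in frame.items()}


# Invented sample values; each column holds a single type, as in an R data frame.
frame = make_frame({"Sex": ["F", "F", "M"], "Bwt": [2.0, 2.1, 2.9], "Hwt": [7.0, 7.2, 10.1]})
print(row(frame, 1))
```

The equal-length check is what makes row access well defined: every column has a value at every row position.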

For example, the command below loads the data from the local file data2013.txt into sampleDataFrame in R.

> sampleDataFrame <- read.csv("~/SanJose/HistoricalDataSet/data2013.txt", header=FALSE)

R also offers a quick way to look at data: copy any data to the clipboard with CTRL+C and import it into R.

x <- read.table(file = "clipboard", sep="\t", header=TRUE)

Database Integration:

In real-world scenarios, most data is stored in an RDBMS. R provides interfaces to connect to them easily. R also provides interfaces for NoSQL databases such as MongoDB, which is in high demand for big data analysis and mining.

R provides the RODBC, RMySQL, ROracle, and RJDBC interfaces for relational databases, as well as RMongo for MongoDB (a NoSQL database) and RNeo4j for Neo4j (a graph database).

These interfaces are very easy to use in R.

Database Integration with MongoDB:

Consider an example where one needs data from the contacts collection of the users database in MongoDB.

Steps to import data from MongoDB to R:

R_Mongo_integration

Classification Example in R:

Data analysts and data miners need classification and clustering very often, and R has many packages for tasks such as:

  • Dimensionality Reduction
  • Frequent Pattern Mining
  • Sequence Mining
  • Clustering
  • Classification

The same goes for any specific problem within these areas. For example, for SVM classification, the e1071, kernlab, klaR, svmpath, and shogun packages are available.

Let's take an example with the e1071 package.

R_PackageInstalled

sampleDataLoading

As shown above, we have loaded the e1071 package and the sample cats data set into R.

Now, in order to do the classification, we need to create a model from the available data set.

SVM_R_EXAMPLE

To visualize the above model:

> plot(model,cats)

SVM_Classification_Plot

A classification problem needs a training data set and a test data set.

  • Divide data into training set and test set

index <- 1:nrow(cats)
testindex <- sample(index, trunc(length(index)/3))
testset <- cats[testindex,]
trainset <- cats[-testindex,]
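
The split above holds out a random third of the rows for testing and trains on the rest. The same logic as a minimal Python sketch, for readers who want to check it outside R: split_indices is a hypothetical helper, indices are 0-based here (R's are 1-based), and the seed is only there to make the run repeatable.

```python
import random


def split_indices(n_rows, parts=3, seed=None):
    """Pick n_rows // parts row indices for testing; the remaining rows train."""
    rng = random.Random(seed)
    test = sorted(rng.sample(range(n_rows), n_rows // parts))
    held_out = set(test)
    train = [i for i in range(n_rows) if i not in held_out]
    return train, test


# 144 is the number of rows in the cats data set from MASS.
train, test = split_indices(144, seed=42)
print(len(train), len(test))
```

Every row lands in exactly one of the two sets, so the model is never evaluated on rows it was trained on.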

  • Train Model

model <- svm(Sex~., data = trainset)
prediction <- predict(model, testset[,-1])

To verify the result, many evaluation techniques are available, such as gain and lift charts, the K-S (Kolmogorov-Smirnov) chart, the ROC chart, and the area under the curve (AUC).

Below is the confusion matrix.

tab <- table(pred = prediction, true = testset[,1])

SVM_ConfusionMatrix

As shown above, 37 instances are classified correctly and 11 instances are classified wrongly.
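
A confusion matrix is a count of (predicted, true) label pairs, and accuracy is the share of pairs on its diagonal. The following is a small Python sketch of that bookkeeping: confusion_counts and accuracy are hypothetical helper names, and the five labels are invented for illustration, not the cats predictions. The last line uses the 37 and 11 reported above.

```python
from collections import Counter


def confusion_counts(predicted, actual):
    """Count (predicted, actual) label pairs, similar to R's table(pred, true)."""
    return Counter(zip(predicted, actual))


def accuracy(counts):
    """Share of instances whose predicted label equals the true label."""
    correct = sum(n for (pred, true), n in counts.items() if pred == true)
    return correct / sum(counts.values())


# Invented labels for illustration only; these are not the cats results.
predicted = ["F", "M", "M", "F", "M"]
actual = ["F", "M", "F", "F", "M"]
counts = confusion_counts(predicted, actual)
print(counts[("M", "F")], accuracy(counts))

# With the totals reported above: 37 correct out of 37 + 11 test instances.
print(round(37 / (37 + 11), 3))
```

For the run described in the text this gives an accuracy of roughly 0.771 on the 48 test instances.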

In conclusion, R is a very rich language, with a wide range of packages available for data modeling, analysis, and visualization.

Basic introduction to MongoDB:

MongoDB is a document database that is not tightly bound to a schema, so it is well known for being schemaless while keeping a clear document structure. For large amounts of data, MongoDB has proven to deliver high performance, high availability, and easy scalability.

Data Format in MongoDB:

It stores data (documents) in BSON format, which is a binary-encoded serialization of JSON-like documents. A single document has a size limit of 16 MB. Below is the document format used in MongoDB.

MongoDocumentFormat

  • MongoDB is easy to install.
  • Run mongod.exe from the command prompt to start the database
  • Run mongo.exe from the command prompt to connect and manipulate data

Comparison with RDBMS

SQL_MongoDB

Query Categories

QueryStructures

Importing data from files to MongoDB

  • Easily import data from CSV or JSON files
  • Example: mongoimport --db users --collection contacts --type csv --headerline --file /opt/backups/contacts.csv
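
With --headerline, the first line of the CSV file supplies the field names and every following line becomes one document. The following is a rough Python sketch of that mapping, not of mongoimport itself: the file contents and the function name csv_to_documents are invented for illustration, and all values stay strings here, whereas the real tool may convert numeric-looking values.

```python
import csv
import io

# Invented contents of a contacts.csv file; the first line is the header.
CSV_TEXT = """name,city,age
Asha,San Jose,31
Ravi,Pune,27
"""


def csv_to_documents(text):
    """Turn each CSV row into a dict keyed by the header line's field names."""
    return [dict(record) for record in csv.DictReader(io.StringIO(text))]


documents = csv_to_documents(CSV_TEXT)
print(documents[0])
```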

Sample Queries

Insert Queries

MongoCMD

Projection

  • To limit the fields returned
  • To limit the number of rows
  • To apply conditions, limits, sorting, etc.
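
The three points above can be pictured with plain in-memory data. The following is a Python sketch of the semantics only, not of the MongoDB API: the find helper, its parameters, and the contacts documents are all hypothetical.

```python
# Invented contacts documents; the field names are illustrative.
CONTACTS = [
    {"name": "Asha", "city": "San Jose", "age": 31},
    {"name": "Ravi", "city": "Pune", "age": 27},
    {"name": "Mina", "city": "San Jose", "age": 45},
    {"name": "Omar", "city": "Austin", "age": 38},
]


def find(documents, condition, fields, sort_key, skip=0, limit=None):
    """Filter, sort, skip, limit, then keep only the requested fields."""
    matched = sorted((d for d in documents if condition(d)), key=lambda d: d[sort_key])
    end = None if limit is None else skip + limit
    return [{f: d[f] for f in fields} for d in matched[skip:end]]


result = find(CONTACTS, lambda d: d["age"] > 30, ["name", "age"], "age", skip=1, limit=2)
print(result)
```

The condition decides which documents match, sort, skip, and limit decide which of those are returned, and the projection decides which fields of each document are shown.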

Projection

Applying limit: Data<10

LimitExample

Find Using REGEX

MongoDB_REGEX

Skip, Sort

Sort_Skip_Mongo

Update Multiple Documents Using REGEX

MongoDB_Update_REGEX

Remove Data

MongoDB_Remove

To conclude the MongoDB part, one more very useful feature deserves a mention: MongoDB provides MapReduce support as well.
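
For readers new to the idea, MapReduce emits a (key, value) pair per document, groups the values by key, and then reduces each group to one result. The following is a minimal Python sketch of that pattern, not of MongoDB's own implementation: the map_reduce helper and the order documents are hypothetical.

```python
from collections import defaultdict
from functools import reduce

# Invented order documents; the field names are illustrative.
ORDERS = [
    {"customer": "A", "amount": 250},
    {"customer": "B", "amount": 100},
    {"customer": "A", "amount": 50},
]


def map_reduce(documents, map_fn, reduce_fn):
    """Group the (key, value) pairs emitted by map_fn, then reduce each group."""
    groups = defaultdict(list)
    for document in documents:
        key, value = map_fn(document)
        groups[key].append(value)
    return {key: reduce(reduce_fn, values) for key, values in groups.items()}


totals = map_reduce(ORDERS, lambda d: (d["customer"], d["amount"]), lambda a, b: a + b)
print(totals)
```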

MongoMapReduce

Thus, R is a rich statistical programming language with many ready-made, easy-to-use packages, and MongoDB is a schema-free, highly scalable database well suited to big data mining.

Neo4j Samples (Simple and Complex Queries)

In this post we will go through examples of simple and complex Neo4j queries, along with some scenarios. Let's begin with the basics.
Create Nodes with properties:


Query To Create Simple Nodes:

> CREATE (n:Actor { name : "Keanu Reeves", Age : 50, 
Gender:"Male" , isAlive : true });

> CREATE (n:Actor { name : "Drew Barrymore", Age : 39, 
Gender:"Female" ,isAlive : true });

> CREATE (n:Actor { name : "Paris Hilton", Age: 33, 
Gender:"Female" , isAlive : true });

The queries above create nodes with the label Actor.

Query To add new property to existing Node:


> MATCH (actor:Actor) SET actor.friendsCount = 0 RETURN actor;

The query above simply adds the property friendsCount to all Actor nodes.

Create Relationship between Nodes in Neo4j:


Assuming you are familiar with Twitter and Instagram, where you follow people of your choice, the "follows" relationship (for example, one actor following another actor) has the property

FollowingStatus: (pending, approved, rejected)

It is stored as the status property in the queries below. Let's take an example: Drew Barrymore starts following Paris Hilton, so she sends a request.

>  MATCH (actor:Actor),(actor2:Actor) 
WHERE actor.name = "Drew Barrymore" and actor2.name = "Paris Hilton" 
CREATE (actor)-[rel:FOLLOW{status:"pending"}]->(actor2) RETURN rel;

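The query above can get you into trouble: every time you execute it, it creates another relationship between the same nodes, which can cause confusion. To solve this, we can use the UNIQUE keyword as shown below. (CREATE UNIQUE has since been removed from newer Neo4j versions; MERGE, used in the query after this one, is its replacement.)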

> MATCH (actor:Actor),(actor2:Actor) 
WHERE actor.name = "Drew Barrymore" and actor2.name = "Paris Hilton" 
CREATE UNIQUE (actor)-[rel:FOLLOW{status:"pending"}]->(actor2) RETURN rel;

Suppose that, on top of this, you also want to log information, such as a timestamp or a flag, whenever you try to update the same relationship. For that, we can use the query below.

> MATCH (actor:Actor),(actor2:Actor) 
WHERE actor.name = "Drew Barrymore" AND actor2.name = "Paris Hilton" 
MERGE (actor)-[rel:FOLLOW{status:"pending"}]->(actor2) 
ON CREATE SET rel.createTime=140123120, rel.updateTime=140123120 
ON MATCH SET rel.updateTime=140123120 
RETURN rel;
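
The create-once, update-afterwards behaviour of this query can be pictured with a small in-memory model. The following is a Python sketch of the semantics only, not of Neo4j: merge_relationship and the dict used as a store are hypothetical, and the second timestamp is invented.

```python
def merge_relationship(store, start, rel_type, end, props, now):
    """Create the relationship once; on later calls only refresh updateTime."""
    key = (start, rel_type, end, tuple(sorted(props.items())))
    if key not in store:
        # Mirrors ON CREATE SET: both timestamps are set on first creation.
        store[key] = {"createTime": now, "updateTime": now}
    else:
        # Mirrors ON MATCH SET: only the update timestamp changes.
        store[key]["updateTime"] = now
    return store[key]


relationships = {}
args = ("Drew Barrymore", "FOLLOW", "Paris Hilton", {"status": "pending"})
merge_relationship(relationships, *args, now=140123120)
rel = merge_relationship(relationships, *args, now=140123999)
print(len(relationships), rel)
```

Running the merge twice leaves a single relationship whose createTime is unchanged. Note that the key includes the properties in the pattern, so a relationship with a different status would count as a different match.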

Now suppose you want to update only the status value, given that Paris Hilton has accepted the follow request.

> MATCH (actor:Actor)-[follow:FOLLOW {status:"pending"}]->(actor2:Actor)
WHERE actor.name = "Drew Barrymore" AND actor2.name = "Paris Hilton"
SET follow.status = "accepted", actor.friendsCount = coalesce(actor.friendsCount,0) + 1,
actor2.friendsCount = coalesce(actor2.friendsCount,0) + 1 RETURN follow;

The query above updates the follow request's status property from pending to accepted if and only if it is pending. At the same time, it also updates the friendsCount of both nodes.
One can also make use of Neo4j's CASE feature, as seen below.

Conditional update:


> MATCH (actor:Actor)-[follow:FOLLOW]->(actor2:Actor)
WHERE actor.name = "Drew Barrymore" AND actor2.name = "Paris Hilton"
SET follow.status =
CASE WHEN follow.status = "pending"
THEN "accepted" ELSE follow.status END RETURN follow;

The query above updates the status the same way as the previous one (it leaves friendsCount untouched), and the CASE form helps in many scenarios, such as handling several cases in a single query.
Note: Don't forget to put END after CASE.

Data Retrieval in Neo4j


Using the scenario above, let's create Movie nodes with the properties "name" (name of the movie) and "likes" (number of likes).

An Actor can have a LIKE relationship with a Movie node. This scenario is represented in the figure below.

SampleData

Blue nodes are Actors and orange ones are Movies.

37 (Actor Node): Drew Barrymore
46 (Actor Node): Paris Hilton
67 (Movie Node): ET

The rest of the nodes are other movies liked by Paris Hilton.

Basic Retrieval of Nodes by its Property:


> MATCH (n:Actor {Age:39}) RETURN n;

The output of the query above is the Drew Barrymore node.

> MATCH (n:Actor) WHERE n.Age > 19 RETURN n;

One more example: this query returns all Actor nodes with Age greater than 19.

Normally you won't run into any trouble with the queries above, but in the real world you will need more complex retrieval that combines properties and node status.

Complex Retrieval of Nodes by its Property:


Suppose you want to suggest some movies to an actor.

In our data, let's assume that Drew Barrymore is following (approved) Paris Hilton.
Drew Barrymore has already liked the movie “ET”, and Paris Hilton has liked “ET”, “Grown Ups”, “The Grudge”, and “House Of Wax”.

Now you want the data to look like this:

NAME_OF_MOVIE | WATCHED     | NO_OF_LIKES
--------------+-------------+------------
Grown Ups     | NOT WATCHED | 10
Matrix        | NOT WATCHED | 3
The Grudge    | NOT WATCHED | 2
ET            | WATCHED     | 1
House Of Wax  | NOT WATCHED | 1

To fetch the data shown above, one can use a query like the following.

> MATCH (actor1:Actor)-[follow:FOLLOW]->(actor2:Actor)-[like1:LIKE]->(movie:Movie)
WITH actor1, movie
OPTIONAL MATCH (actor1)-[like2:LIKE]->(movie)
WITH actor1, movie, like2
WHERE actor1.name = "Drew Barrymore"
RETURN DISTINCT movie.name AS NAME_OF_MOVIE,
CASE WHEN like2 IS NULL THEN "NOT WATCHED" ELSE "WATCHED" END AS WATCHED,
movie.likes AS NO_OF_LIKE ORDER BY movie.likes DESC
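
The logic of this query can be traced on plain data structures. The following is a Python sketch of the same steps under stated assumptions: the dictionaries hold only the likes named in the scenario text, the like counts are taken from the expected table, and Matrix is left out because the text does not say which followed actor liked it. The function name suggestions is hypothetical.

```python
# Data from the scenario text; like counts come from the expected table.
FOLLOWS = {"Drew Barrymore": ["Paris Hilton"]}
LIKES = {
    "Drew Barrymore": ["ET"],
    "Paris Hilton": ["ET", "Grown Ups", "The Grudge", "House Of Wax"],
}
MOVIE_LIKES = {"ET": 1, "Grown Ups": 10, "The Grudge": 2, "House Of Wax": 1}


def suggestions(actor):
    """List movies liked by followed actors, flagged by whether actor likes them."""
    own = set(LIKES.get(actor, []))
    movies = {m for followed in FOLLOWS.get(actor, []) for m in LIKES.get(followed, [])}
    rows = [
        (m, "WATCHED" if m in own else "NOT WATCHED", MOVIE_LIKES[m])
        for m in movies
    ]
    # Highest like count first; ties are broken by name to keep the order stable.
    return sorted(rows, key=lambda r: (-r[2], r[0]))


for line in suggestions("Drew Barrymore"):
    print(line)
```

The set of movies plays the role of DISTINCT, and the membership test against the actor's own likes plays the role of the OPTIONAL MATCH followed by the IS NULL check.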

The output is:

OutputOfNeo4jRetrieval

Thank you for reading, and please feel free to comment if you encounter any difficulties or have specific scenarios.

Also, if you would like to set up a High Availability cluster of Neo4j for your organization or personal use, please have a look at http://neo4j.com/docs/stable/ha-how.html.

How to Add a New Row to a Data Frame in R

Suppose your data set is cats, provided by the MASS library.

To load it, type the following commands in the R console:

> library(MASS)

> cats

You have now loaded the cats data set, which looks like this:

Sex Bwt Hwt
1 F 2.0 7.0
2 F 2.0 7.4
3 F 2.0 9.5

Now suppose you want to add a new row to the existing data.
1) First create the new row as a data frame

> newRow <- data.frame(Sex='F',Bwt=2.1,Hwt=8.1)

2) Then append it to the existing data with rbind

> cats <- rbind(cats,newRow)

That's it!

Type
> cats

You will see the new row appended to the existing data.