007 Machine Learning - Logistic Regression
What is Logistic Regression?
Learns a True/False decision model from a large number of data points.
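As a minimal sketch of what such a True/False model computes, the plain-Scala code below applies the logistic (sigmoid) function to a weighted sum of the inputs and compares the result to a threshold. The object name `LogisticSketch`, the parameter names, and the 0.5 threshold are assumptions made for illustration; none of this is the demo's own code.

```scala
object LogisticSketch {
  // Logistic (sigmoid) function: maps any real score to a value in (0, 1).
  def sigmoid(z: Double): Double = 1.0 / (1.0 + math.exp(-z))

  // Probability that a record is True, given hypothetical weights and an
  // intercept: sigmoid(w . x + b).
  def probability(weights: Array[Double], intercept: Double,
                  features: Array[Double]): Double = {
    require(weights.length == features.length,
      "weights and features must have the same length")
    val score = weights.zip(features).map { case (w, x) => w * x }.sum + intercept
    sigmoid(score)
  }

  // True/False decision: predict True when the probability reaches the
  // threshold (0.5 here is an assumed default, not a value from the demo).
  def predict(weights: Array[Double], intercept: Double,
              features: Array[Double], threshold: Double = 0.5): Boolean =
    probability(weights, intercept, features) >= threshold
}
```

Training is then the search for the weights and intercept that make these predictions match the labeled data as closely as possible.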
http://bit.ly/2Mvj4ro : Towards Data Science - Logistic Regression — Detailed Overview, Saishruthi Swaminathan, 2018
This demo:
Detect a small number of frauds in a large set of transactions
Input data set:
284,807 records, of which 492 are frauds (≈ 0.17%)
Each record consists of 30 floating-point parameters and one boolean label (True = Fraud)
Demo
Randomly select 90% of the records as training data, use the remaining 10% as inference input, and compare the detected results with the expected ones
https://gyazo.com/e03f55177ef54c31beec5f2c70db2a2a
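The split-and-compare procedure can be sketched in plain Scala as follows. This is an illustration under assumptions, not the demo's source: `SplitSketch`, `Record`, and the seed parameter are hypothetical names, and the sketch works on in-memory collections rather than Spark RDDs (Spark offers `RDD.randomSplit` for this purpose, though the text does not say whether the demo uses it).

```scala
import scala.util.Random

object SplitSketch {
  // One record: feature values plus the expected label (true = fraud).
  final case class Record(features: Vector[Double], isFraud: Boolean)

  // Shuffle the records with a fixed seed and cut them into
  // (training, test) parts, e.g. trainFraction = 0.9 for a 90/10 split.
  def randomSplit(records: Seq[Record], trainFraction: Double,
                  seed: Long): (Seq[Record], Seq[Record]) = {
    val shuffled = new Random(seed).shuffle(records)
    val cut = math.round(records.size * trainFraction).toInt
    shuffled.splitAt(cut)
  }

  // Fraction of test records whose predicted label matches the expected one.
  def accuracy(test: Seq[Record], predict: Vector[Double] => Boolean): Double =
    if (test.isEmpty) 0.0
    else test.count(r => predict(r.features) == r.isFraud).toDouble / test.size
}
```

Here `accuracy` takes any function from features to a Boolean, so a trained model's prediction method can be passed in directly.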
Its source code consists of two sections:
Read data and create the array for training: 15 sec on normal Spark
Logistic Regression with SGD: 1 min 17 sec = 77 sec on normal Spark
Inference accuracy is ~93%
Accelerate the latter section (Logistic Regression with SGD) with Frovedis
val model = LogisticRegressionWithSGD.train(ovs_training,1000)
9.4 sec on Spark with Frovedis -- 8.2 times speedup
Later improved to a 27 to 51 times speedup (40 to 77 sec on normal Spark, 1.5 sec on Spark with the latest Frovedis).
https://gyazo.com/e3fc80a23735f5ec59915906146c3511
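The speedup figures follow from dividing the normal-Spark time by the Frovedis time. The small Scala sketch below redoes that arithmetic with the timings quoted above; `SpeedupSketch` and `speedup` are names chosen here for illustration.

```scala
object SpeedupSketch {
  // Speedup = baseline time / accelerated time.
  def speedup(baselineSec: Double, acceleratedSec: Double): Double = {
    require(acceleratedSec > 0.0, "accelerated time must be positive")
    baselineSec / acceleratedSec
  }

  def main(args: Array[String]): Unit = {
    // Timings taken from the text: 77 sec and 40 sec on normal Spark,
    // 9.4 sec and 1.5 sec on Spark with Frovedis.
    println(s"77 sec vs 9.4 sec: ${speedup(77.0, 9.4)}")
    println(s"40 sec vs 1.5 sec: ${speedup(40.0, 1.5)}")
    println(s"77 sec vs 1.5 sec: ${speedup(77.0, 1.5)}")
  }
}
```

The three ratios come to roughly 8.2, 26.7, and 51.3, which matches the 8.2 times figure and the later range.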