Comparing:
# 1) sklearn
import sklearn
# 2) onnxruntime
import skl2onnx
import onnxruntime
# 3) mlprodict
import mlprodict
VotingRegressor(estimators=[('gb', GradientBoostingRegressor(random_state=1)), ('rf', RandomForestRegressor(random_state=1)), ('lr', LinearRegression())])
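The pipeline under comparison is a VotingRegressor ensemble. For reference, a model like this can be built and fitted as follows (a minimal sketch; the dataset, the train/test split, and the one_observation row are assumptions, not taken from the original notebook):

from sklearn.datasets import fetch_california_housing
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor, VotingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# Hypothetical dataset: any tabular regression data works here.
X, y = fetch_california_housing(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

model = VotingRegressor(estimators=[
    ('gb', GradientBoostingRegressor(random_state=1)),
    ('rf', RandomForestRegressor(random_state=1)),
    ('lr', LinearRegression()),
])
model.fit(X_train, y_train)

# A single row to compare predictions across runtimes.
one_observation = X_test[:1]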
Using sklearn's model.predict()
model.predict(one_observation)
array([-118.94052453])
Using onnxruntime
import numpy
from skl2onnx import to_onnx
from onnxruntime import InferenceSession
onx = to_onnx(model, X_train[:1].astype(numpy.float32),
              target_opset=14)
sess = InferenceSession(onx.SerializeToString())
sess.run(None, {'X': one_observation})[0]
array([[-118.94051]], dtype=float32)
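The ONNX graph computes in single precision, which explains the tiny difference with scikit-learn's float64 result. The agreement can be checked programmatically; a small sketch, assuming the variables defined above and that one_observation is a float64 numpy array:

import numpy

skl_pred = model.predict(one_observation)
onnx_pred = sess.run(None, {'X': one_observation.astype(numpy.float32)})[0]

# Only expect ~1e-4 relative agreement because of the float32 computation.
numpy.testing.assert_allclose(skl_pred, onnx_pred.ravel(), rtol=1e-4)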
We can also save the .onnx model to disk:
with open("sklearn_model.onnx", "wb") as f:
f.write(onx.SerializeToString())
!ls sklearn_model.onnx
sklearn_model.onnx
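The exported file can be reloaded without going through skl2onnx again, since InferenceSession also accepts a file path. A small sketch (the variable name sess2 is only for illustration):

from onnxruntime import InferenceSession

# Load the serialized model straight from disk and run it on the same observation.
sess2 = InferenceSession("sklearn_model.onnx")
sess2.run(None, {'X': one_observation})[0]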
Benchmarking prediction speed of the three runtimes over increasing batch sizes:
|   | average | deviation | min_exec | max_exec | repeat | number | size | scikit-learn | onnxruntime | mlprodict |
|---|---------|-----------|----------|----------|--------|--------|------|--------------|-------------|-----------|
| 0 | 0.007386 | 0.000550 | 0.006989 | 0.009626 | 50 | 10 | 1 | 0.007386 | 2.229566e-05 | 0.000151 |
| 1 | 0.006992 | 0.000191 | 0.006660 | 0.007483 | 50 | 10 | 2 | 0.003496 | 1.496709e-05 | 0.000034 |
| 2 | 0.006990 | 0.000170 | 0.006744 | 0.007759 | 50 | 10 | 5 | 0.001398 | 9.139367e-06 | 0.000017 |
| 3 | 0.006981 | 0.000349 | 0.006703 | 0.008546 | 50 | 10 | 10 | 0.000698 | 8.288960e-06 | 0.000013 |
| 4 | 0.008790 | 0.000216 | 0.008502 | 0.009894 | 50 | 10 | 250 | 0.000035 | 1.145972e-06 | 0.000003 |
| 5 | 0.010951 | 0.000558 | 0.010300 | 0.013073 | 50 | 10 | 500 | 0.000022 | 1.050886e-06 | 0.000003 |
| 6 | 0.012817 | 0.000559 | 0.012026 | 0.014479 | 50 | 10 | 750 | 0.000017 | 1.103438e-06 | 0.000003 |
| 7 | 0.014354 | 0.000265 | 0.014124 | 0.015040 | 10 | 10 | 1000 | 0.000014 | 1.081331e-06 | 0.000003 |
| 8 | 0.040459 | 0.007147 | 0.037611 | 0.061870 | 10 | 10 | 5000 | 0.000008 | 9.428774e-07 | 0.000003 |
| 9 | 0.072818 | 0.003606 | 0.066701 | 0.077930 | 5 | 10 | 10000 | 0.000007 | 1.247680e-06 | 0.000003 |
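The last three columns report the average prediction time per observation (average divided by batch size) for each runtime. A benchmark along these lines could produce such numbers; this is a rough sketch, not the exact harness used above, and the batch sizes, the measure helper, and the use of mlprodict's OnnxInference are assumptions:

import time
import numpy
from onnxruntime import InferenceSession
from mlprodict.onnxrt import OnnxInference

sess = InferenceSession(onx.SerializeToString())
oinf = OnnxInference(onx)  # mlprodict's own runtime for the same ONNX graph

def measure(predict, x, repeat=10):
    # Average prediction time per observation over `repeat` calls.
    start = time.perf_counter()
    for _ in range(repeat):
        predict(x)
    return (time.perf_counter() - start) / (repeat * x.shape[0])

results = []
for size in [1, 2, 5, 10, 250, 500, 750, 1000, 5000, 10000]:
    x = X_train[:size].astype(numpy.float32)  # assumes X_train has enough rows
    results.append({
        'size': size,
        'scikit-learn': measure(model.predict, x),
        'onnxruntime': measure(lambda v: sess.run(None, {'X': v}), x),
        'mlprodict': measure(lambda v: oinf.run({'X': v}), x),
    })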
Plot speed differences
scikit-learn is optimized for training (large batches). scikit-learn and the ONNX runtimes converge for big batches: both use similar implementations, parallelization and languages (C++, OpenMP).
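A figure like this can be reproduced from the benchmark results, for instance with pandas and matplotlib; a sketch, assuming the results list from the benchmark sketch above:

import pandas
import matplotlib.pyplot as plt

# Log-log plot of per-observation prediction time against batch size.
df = pandas.DataFrame(results)
ax = df.plot(x='size', y=['scikit-learn', 'onnxruntime', 'mlprodict'],
             logx=True, logy=True)
ax.set_xlabel('batch size')
ax.set_ylabel('prediction time per observation (s)')
plt.show()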
ONNX in the browser ✨
from IPython.display import IFrame
IFrame('https://dunnkers.com/neural-network-backdoors/', width=1000, height=700)
Built by Jeroen Overschie @ GoDataDriven in 2022
→ inspired by a sklearn-onnx benchmark Notebook