Using The Phi3 SLM Locally In Microsoft Fabric For GenAI

Sandeep Pawar | fabric.guru | May 21, 2024

#install the ONNX Runtime GenAI package (pre-release build)
!pip install --pre onnxruntime-genai --quiet
#create a directory for the model
import os 
import shutil
import onnxruntime_genai as og

model_path = '/lakehouse/default/Files/phi3mini'

#a default lakehouse must be attached to the notebook for this path to exist
if not os.path.exists(model_path):
    os.makedirs(model_path)
    print(f"model will be downloaded to {model_path}")
else:
    print(f"{model_path} exists")
/lakehouse/default/Files/phi3mini exists
!huggingface-cli download microsoft/Phi-3-mini-4k-instruct-onnx --include cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4/* --local-dir ./lakehouse/default/Files/phi3mini
Consider using `hf_transfer` for faster downloads. This solution comes with some limitations. See https://huggingface.co/docs/huggingface_hub/hf_transfer for more details.
Fetching 10 files: 100%|████████████████████████| 10/10 [00:22<00:00,  2.25s/it]
/mnt/var/hadoop/tmp/nm-local-dir/usercache/trusted-service-user/appcache/application_1716331043412_0001/container_1716331043412_0001_01_000001/lakehouse/default/Files/phi3mini
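#copy files to the lakehouse from the temp directory
#the CLI wrote to a path relative to the working directory; model_path starts with "/",
#so plain concatenation (not os.path.join) gives <working dir>/lakehouse/default/Files/phi3mini
source_dir = os.path.abspath(".") + model_path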
destination_dir = model_path

try:
    shutil.copytree(source_dir, destination_dir, dirs_exist_ok=True)
    print("Directory copied successfully.")
except Exception as e:
    print(f"Error occurred: {e}")
Directory copied successfully.
#confirm files have been copied
for f in mssparkutils.fs.ls('Files/phi3mini/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4'):
    print(f.name)
added_tokens.json
config.json
configuration_phi3.py
genai_config.json
phi3-mini-4k-instruct-cpu-int4-rtn-block-32-acc-level-4.onnx
phi3-mini-4k-instruct-cpu-int4-rtn-block-32-acc-level-4.onnx.data
special_tokens_map.json
tokenizer.json
tokenizer.model
tokenizer_config.json
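Before loading the model, it can help to confirm that the copy landed the files the loader is likely to need. The following is a minimal sketch, not part of the original notebook: `missing_model_files` and `EXPECTED_FILES` are hypothetical names, and the file list is taken from the listing above on the assumption that the repository layout has not changed.

```python
import os

# Expected files, taken from the directory listing above (assumption: the
# repository layout is the same as when this notebook was run).
EXPECTED_FILES = [
    "genai_config.json",
    "tokenizer.json",
    "tokenizer.model",
    "tokenizer_config.json",
    "phi3-mini-4k-instruct-cpu-int4-rtn-block-32-acc-level-4.onnx",
    "phi3-mini-4k-instruct-cpu-int4-rtn-block-32-acc-level-4.onnx.data",
]

def missing_model_files(model_dir, expected=EXPECTED_FILES):
    """Return the expected file names that are absent from model_dir."""
    if not os.path.isdir(model_dir):
        # Nothing was copied at all, so every expected file is missing.
        return list(expected)
    present = set(os.listdir(model_dir))
    return [name for name in expected if name not in present]
```

Calling `missing_model_files(model_path + '/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4')` should return an empty list when the copy completed; any names it returns point to files that need to be downloaded or copied again.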
#initialize the model
model = og.Model(model_path+'/cpu_and_mobile/cpu-int4-rtn-block-32-acc-level-4')

#tokenize
tokenizer = og.Tokenizer(model)
tokenizer_stream = tokenizer.create_stream()

#set params
search_options = {"max_length": 1024,"temperature":0.6}
params = og.GeneratorParams(model)
params.try_graph_capture_with_max_batch_size(1)
params.set_search_options(**search_options)

#provide instructions and prompts
instruction = "You are a renowned poet known for his playful poems"
prompt = " Write a poem to convince someone why pineapple on pizza is awesome"
prompt = f"<|system|>{instruction}<|end|><|user|>{prompt}<|end|><|assistant|>"

#tokenize prompt
input_tokens = tokenizer.encode(prompt)
params.input_ids = input_tokens

#generate response
generator = og.Generator(model, params)
while not generator.is_done():
    generator.compute_logits()
    generator.generate_next_token()

    #decode and stream each new token as it is generated
    new_token = generator.get_next_tokens()[0]
    print(tokenizer_stream.decode(new_token), end='', flush=True)
 In a land where flavors collide,
A culinary adventure, side by side,
A tale of tastes, a sweet and savory ride,
A pineapple on pizza, a delightful stride.

A golden crown of fruit, a tropical sight,
Glistening under the kitchen's warm light,
A slice of paradise, a flavor so bright,
Pineapple on pizza, a taste bud' fearsome fight.

Sweet and tangy, a flavor explosion,
A symphony of tastes, a flavor fusion,
A pizza slice, a pineapple's illusion,
A culinary masterpiece, a gastronomic revolution.

Crispy crust, a foundation so firm,
A bed for the fruit, a taste to affirm,
A pineapple's sweetness, a flavor to confirm,
A pizza slice, a culinary term.

Skeptics may scoff, critics may jeer,
But taste buds will sing, and hearts will cheer,
For the pineapple on pizza, a flavor so clear,
A delicious delight, a culinary pioneer.

So here's to the pineapple, a fruit so bold,
On a pizza slice, a story untold,
A taste so unique, a flavor to behold,
Pineapple on pizza, a culinary gold.

So come, take a bite, let your taste buds dance,
In this flavorful world, give it a chance,
Pineapple on pizza, a culinary romance,
A taste so divine, a flavor so grand.
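
To try other instructions and prompts without rewriting the f-string each time, the prompt assembly can be wrapped in a small helper. This is a sketch only: `build_phi3_prompt` is a hypothetical name, and it mirrors the exact tag layout used in the notebook cell above (system, user, then an open assistant tag, with no separators between them) rather than any other chat template.

```python
def build_phi3_prompt(user_message, system_message=None):
    """Assemble a single-turn prompt with the same tags as the notebook cell."""
    parts = []
    if system_message:
        # Optional system instruction, closed with the end-of-turn tag.
        parts.append(f"<|system|>{system_message}<|end|>")
    parts.append(f"<|user|>{user_message}<|end|>")
    # Leave the assistant turn open so the model continues from here.
    parts.append("<|assistant|>")
    return "".join(parts)
```

With this helper, `tokenizer.encode(build_phi3_prompt(prompt, instruction))` would produce the same input tokens as the cell above, given the same `instruction` and `prompt` strings.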