Pulse · microsoft/onnxruntime · GitHub

July 4, 2024 – July 11, 2024

Overview

54 Active pull requests

44 Active issues

32 Pull requests merged by 24 people

Migraphx ep windows build
#21284 merged Jul 12, 2024
[VitisAI] custom op support multiple outputs
#21280 merged Jul 11, 2024
Implement FlashAttention for CPU
#20805 merged Jul 11, 2024
Minor updates to Java docs
#21269 merged Jul 11, 2024
Enable Android CI build stages to run in parallel.
#21314 merged Jul 11, 2024
Move QNN nuget package stages out of the big Nuget packaging pipeline.
#21306 merged Jul 11, 2024
Fix typos - 1st Wave
#21278 merged Jul 11, 2024
Fix lint C++ actions
#21303 merged Jul 11, 2024
Enable LTO for Android build
#21243 merged Jul 11, 2024
[DirectML] Broadcast NC-dims for Tensors A&B in DynamicQuantizeMatMul
#21298 merged Jul 11, 2024
[MLAS] AArch64 SQNBitGemm CompInt8 initial multi-row implementation
#21193 merged Jul 10, 2024
Update absl
#21300 merged Jul 10, 2024
[QNN EP] Initial INT4 support
#21171 merged Jul 10, 2024
[Fix] InterOpNumThreads Session Option for ONNX ReactNative Package
#21263 merged Jul 10, 2024
[build] allow MPI on Unix when NCCL is disabled
#21175 merged Jul 10, 2024
[ROCm] fix: obtain AMD GPU memory info through rocm_smi library
#21190 merged Jul 10, 2024
[VSINPU]Code improvement && Slice/Dropout OP support
#21217 merged Jul 10, 2024
[Build] Propagate build option for CUDA minimal to TRT
#20695 merged Jul 9, 2024
[NNAPI EP] Track skipped initializer usage
#21286 merged Jul 9, 2024
Fix ETW Sink Initialize improperly locking
#21226 merged Jul 9, 2024
Update OpenVino CI Ubuntu to 22.04
#21127 merged Jul 9, 2024
[WebNN EP] Release WebNN MLGraphBuilder after Compile to free memory
#21200 merged Jul 9, 2024
Remove core/common/gsl.h
#20894 merged Jul 9, 2024
Added requested install instructions to ORT ROCm Python.
#21124 merged Jul 8, 2024
[js/webnn] Enable user-supplied MLContext
#20600 merged Jul 8, 2024
[WebNN EP] Remove constraint for conv ops on CPU backend
#21237 merged Jul 8, 2024
[vitisai] Fix build failure introduced by #20920
#21247 merged Jul 8, 2024
Java API Docs for GenerateAPI
#21125 merged Jul 5, 2024
Add MatMulNBits shape infer to SymbolicShapeInference
#21246 merged Jul 5, 2024
Fix typo in genai build DML from source steps
#21268 merged Jul 5, 2024
[Fix Bug] Fp8*Fp8 Run Error
#20911 merged Jul 5, 2024
Use cuda memset async
#21216 merged Jul 5, 2024

22 Pull requests opened by 19 people

Use the scalar MlasSgemm CopyPackB and TransposePackB implementation for RISCV
#21261 opened Jul 5, 2024
[Fix] Exception in iosDynamicFramework Post-Merge workflow
#21262 opened Jul 5, 2024
Csharp: Add CopyOutputsToCpu API
#21274 opened Jul 7, 2024
Fix a build error when CUDA is enabled and onnxruntime_DISABLE_CONTRIB_OPS is ON
#21285 opened Jul 8, 2024
[WebNN EP] Support ConvTranspose for TFLite backend
#21291 opened Jul 9, 2024
[WebNN EP] ConvTranspose should calculate the pads or output shape
#21292 opened Jul 9, 2024
[VitisAI] fix graph save
#21293 opened Jul 9, 2024
change ci docker image to rocm6.1
#21296 opened Jul 9, 2024
[WebNN EP] Enable IO Bindings with MLBuffer
#21301 opened Jul 9, 2024
Fix ETW Sink Initialize improperly locking (#21226)
#21302 opened Jul 10, 2024
Fix Android build on Windows
#21304 opened Jul 10, 2024
Allow Memory Efficient Attention Kernel to run when local window size is set
#21310 opened Jul 10, 2024
[js/webgpu] Remove unnecessary initialization of var
#21312 opened Jul 11, 2024
Extend QDQPropagation transformer to handle multiple consumers
#21313 opened Jul 11, 2024
Fix bert profiler bug
#21315 opened Jul 11, 2024
Update DirectML from 1.14.1 to 1.15.0
#21323 opened Jul 11, 2024
Add ML Program support for basic activation ops
#21326 opened Jul 11, 2024
Refactor onnxruntime_fetchcontent_makeavailable cmake function
#21328 opened Jul 11, 2024
Move ReluQuantFusion to Level2 for CPU EP only
#21329 opened Jul 12, 2024
disable symbolic shape inference by default
#21330 opened Jul 12, 2024
Remove shape infer from bridge ort
#21331 opened Jul 12, 2024
Move Gelu and LayerNorm fusion to L1 optimization
#21332 opened Jul 12, 2024

14 Issues closed by 10 people

[Documentation] Typo in tutorials at the top of the official webpage
#21146 closed Jul 11, 2024
[Mobile] Pre-built 1.18.1 lib is missing for onnxruntime-android
#21305 closed Jul 10, 2024
[Build] Cross-compiling ONNX for Android on Windows CMAKE Ninja error
#21242 closed Jul 10, 2024
[Performance] Unexpected prediction for OCR model in Flask multithreading
#21288 closed Jul 10, 2024
[Mobile] Pixel 7a (Google edge tpu) NNAPI runtime exception unordered_map::at: key not found when setting USE_FP16
#21230 closed Jul 9, 2024
[Build] ‘struct onnxruntime::ProviderHostCPU’ has no member named ‘UpsampleBase__AdjustOutputSizeAsPolicy’ when CONTRIB ops are disabled.
#21204 closed Jul 9, 2024
Help needed to export in ONNX
#21282 closed Jul 9, 2024
[Build] update version of "cutlass"
#19891 closed Jul 8, 2024
[Build] Error when build directly from repository
#21266 closed Jul 8, 2024
[Build] support for CPython 3.13.0b1
#20832 closed Jul 8, 2024
[Build] Missing DLL onnxruntime_providers_cuda.dll. Where do you get this dll?
#21256 closed Jul 8, 2024
[Documentation] Do intra_op_num_threads and inter_op_num_threads not correspond to thread_pool_size?
#21252 closed Jul 8, 2024
[TensorRT EP] OOM (RAM) when loading ONNX model
#21219 closed Jul 7, 2024
Issue with performing shape inference using symbolic_shape_infer.py with Phi-3 ONNX Models
#21194 closed Jul 6, 2024

30 Issues opened by 30 people

[Training] [ShapeInferenceError] Dimension could not be inferred: incompatible shapes
#21327 opened Jul 11, 2024
Model saved with offline basic optimizations will not load - ShapeInferenceError
#21325 opened Jul 11, 2024
[Crash] Crash while loading AlibabaNLP/gte-base ONNX model
#21322 opened Jul 11, 2024
Not able to load onnx model multilingual-e5-large
#21321 opened Jul 11, 2024
[Build] RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Non-zero status code returned while running MatMul node. Name:'/MatMul_7' Status Message: /onnxruntime_src/onnxruntime/core/framework/op_kernel.cc:83 virtual OrtValue* onnxruntime::OpKernelContext::OutputMLValue(int, const onnxruntime::TensorShape&) status.IsOK() was false. Shape mismatch attempting to re-use buffer. {1,1,512} != {1,32,512}. Validate usage of dim_value (values should be > 0) and dim_param (all values with the same string should equate to the same size) in shapes in the model.
#21320 opened Jul 11, 2024
QDQ removal around Resize (mode=linear) causes wrong numeric values
#21319 opened Jul 11, 2024
[Web]
#21318 opened Jul 11, 2024
[Build] ModuleNotFoundError: No module named 'onnxruntime.capi'
#21317 opened Jul 11, 2024
[Feature Request]
#21316 opened Jul 11, 2024
ROCm Conv is not thread safe
#21311 opened Jul 10, 2024
NativeMethod failed with NullReferenceException under c# ONNX in Windows_NT
#21309 opened Jul 10, 2024
[Build] breakage with protobuf-27.2, due to API change
#21308 opened Jul 10, 2024
[TensorRT] Caching to a dedicated ONNX file does not work
#21307 opened Jul 10, 2024
[Feature Request] Include and expose protobuf serialization / deserialization in the JS interface (ideally in `onnxruntime-common`)
#21297 opened Jul 9, 2024
windows arm64(Snapdragon(R) X 12-core X1E80100 @ 3.40 GHz) [Feature Request]
#21295 opened Jul 9, 2024
Ensures the contrast between foreground and background colors meets WCAG 2 AA minimum contrast ratio thresholds (.s:nth-child(4))
#21294 opened Jul 9, 2024
Tried to specify the thread pool when creating an OrtEnvironment, but one already exists
#21290 opened Jul 9, 2024
[Mobile] Android/Kotlin/JAVA Multi Threading for Multi models in android app
#21289 opened Jul 9, 2024
[Feature Request] SpaceToDepth & DepthToSpace integer implementations
#21287 opened Jul 8, 2024
[Build] Heap overflow caused by the onnx runtime
#21283 opened Jul 8, 2024
C++ library SONAME without full version number
#21281 opened Jul 8, 2024
[VitisAI] Wrong model path after using std::filesystem::path
#21279 opened Jul 8, 2024
onnxruntime quantization weights not tied
#21277 opened Jul 8, 2024
[TensorRT ExecutionProvider] Cannot infer the model on a GPU device with an ID other than 0
#21276 opened Jul 8, 2024
[Web] Inconsistent results between running onnx model through python and with onnxruntime-web
#21275 opened Jul 7, 2024
CUDA_PATH is set but CUDA wasn't able to be loaded
#21272 opened Jul 6, 2024
[Feature Request] MPS provider
#21271 opened Jul 6, 2024
Import error with onnxruntime-directml 1.18.1
#21270 opened Jul 6, 2024
onnxruntime_perf_test.exe failing starting v1.18.0
#21267 opened Jul 5, 2024
[Build] Problems when use onnxruntime
#21264 opened Jul 5, 2024

68 Unresolved conversations

Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.

Implementation of IObinding in Mixtral MoE Parity Script
#21153 commented on Jul 12, 2024 • 25 new comments
[Optimizer] DQ + MatMul to MatMulNBits support
#21180 commented on Jul 11, 2024 • 14 new comments
Replace Android CI vmImage: 'MacOS-12' with vmImage: 'ubuntu-latest'
#21172 commented on Jul 11, 2024 • 9 new comments
Added WebNN Intro and Tutorial
#20719 commented on Jul 11, 2024 • 8 new comments
[WIP] Contrib operator for Deformable Multi-Scale Attention
#20184 commented on Jul 8, 2024 • 4 new comments
Replace inline pip install with pip install from requirements*.txt
#21106 commented on Jul 10, 2024 • 2 new comments
Enable AVX NE CONVERT for FP16 to FP32 cast
#21183 commented on Jul 11, 2024 • 2 new comments
Add warning for scale being too small to quantize bias
#21155 commented on Jul 5, 2024 • 2 new comments
Add QNN EP option context_node_name_prefix to set EPContext node name prefix
#21236 commented on Jul 12, 2024 • 1 new comment
[Training] Fix Overflow Handling in Cast Infer for ORTModule.
#21202 commented on Jul 5, 2024 • 1 new comment
Enablement of onnxruntime for AIX and fixing issues related to big-endian platform.
#21133 commented on Jul 9, 2024 • 1 new comment
Adds ATen fallback for scaled_dot_product_attention
#21107 commented on Jul 9, 2024 • 1 new comment
Fix broken GRU tests
#15914 commented on Jul 10, 2024 • 0 new comments
[DML EP] Add BFC allocator
#16634 commented on Jul 10, 2024 • 0 new comments
Update pool to MacOS-13
#17361 commented on Jul 10, 2024 • 0 new comments
[DML EP] Add graph support for concat when some provided inputs are 0-dimension tensors
#17501 commented on Jul 10, 2024 • 0 new comments
Improve DML session initialization time & fix error when unpacking Tensors in DML Backend
#19220 commented on Jul 11, 2024 • 0 new comments
[Performance] Massive Performance slowdown from v1.13.1 -> 1.14.0
#20400 commented on Jul 12, 2024 • 0 new comments
Big endian issue: Graph Transformation Attention Fusion tests are failing
#12921 commented on Jul 11, 2024 • 0 new comments
[Training] Cannot export model for inferencing from session created from buffers
#21152 commented on Jul 11, 2024 • 0 new comments
[Documentation] How Configure CUDA 12.* and cuDNN for GPU with ONNX Runtime and C# on Windows 11
#21212 commented on Jul 11, 2024 • 0 new comments
SIGSEGV on CoreMLExecutionProvider when using dynamic batch
#21227 commented on Jul 11, 2024 • 0 new comments
[Training] ImportError: cannot import name 'PropagateCastOpsStrategy' from 'onnxruntime.capi._pybind_state'
#21233 commented on Jul 11, 2024 • 0 new comments
Disable algo caching in ROCM EP
#19567 commented on Jul 10, 2024 • 0 new comments
Mlas int4 int8 with avx2/512
#20687 commented on Jul 11, 2024 • 0 new comments
VitisAI EP Context Model
#20926 commented on Jul 12, 2024 • 0 new comments
[TensorRT EP] Simplify TRTEP version update logic
#21061 commented on Jul 11, 2024 • 0 new comments
Connecting fp16xq4 gemm kernels (optimized for A100) to MatMulNBits<fp16> operator
#21083 commented on Jul 10, 2024 • 0 new comments
Enabling S8S8 and S8U8 handling in QGemm for AVX2 and AVX-VNNI
#21123 commented on Jul 8, 2024 • 0 new comments
Bump socket.io from 4.6.1 to 4.7.5 in /js/web
#21126 commented on Jul 11, 2024 • 0 new comments
Keep QDQ nodes w/ nonpositive scale around MaxPool
#21182 commented on Jul 8, 2024 • 0 new comments
Implementation of set membership
#21222 commented on Jul 8, 2024 • 0 new comments
[js/webgpu] Enable conv+clip fuse on mobilenetv2-12-f16
#21234 commented on Jul 11, 2024 • 0 new comments
[WIP][JS/WebGPU] Initial changes to support wasm64.
#21260 commented on Jul 11, 2024 • 0 new comments
[Bug] Coqui VITS ONNX model can't be statically quantized.
#16738 commented on Jul 9, 2024 • 0 new comments
[Feature Request] Move graph compilation behind higher transformers (graph optimization)
#20915 commented on Jul 9, 2024 • 0 new comments
[Web] WebGPU and WASM Backends Unavailable within Service Worker
#20876 commented on Jul 9, 2024 • 0 new comments
[Build] DLLs in maven build are not digitally signed
#19204 commented on Jul 8, 2024 • 0 new comments
Support Numpy v2.0
#21063 commented on Jul 8, 2024 • 0 new comments
Using ML.Net and ONNX in Alpine Docker gives library load error.
#8162 commented on Jul 8, 2024 • 0 new comments
Stateful/Memory models
#20943 commented on Jul 8, 2024 • 0 new comments
Error in quantize vicuna-7b model from fp16 to int8
#20867 commented on Jul 7, 2024 • 0 new comments
ONNXruntime version 1.18.0
#20877 commented on Jul 7, 2024 • 0 new comments
[Performance] Is my script set to get optimal performance of onnxruntime?
#20945 commented on Jul 7, 2024 • 0 new comments
Mac m1 build android.The compiler doesn't support BFLOAT16!!!
#20948 commented on Jul 7, 2024 • 0 new comments
How can I debug a reproducible error?
#20792 commented on Jul 6, 2024 • 0 new comments
[Documentation] How to run this model on android mobile platform
#20937 commented on Jul 6, 2024 • 0 new comments
[Web] cannot load onnx model in a vite/react project, because of error expected magic word 00 61 73 6d, found 3c 21 44 4f @+0
#19556 commented on Jul 6, 2024 • 0 new comments
[Build] build python wheel fails
#21145 commented on Jul 6, 2024 • 0 new comments
Initialization crash using OnnxRuntime 17.0 (previously working on 16.3)
#21205 commented on Jul 5, 2024 • 0 new comments
How to specify fusion rules with quantization?
#21251 commented on Jul 5, 2024 • 0 new comments
[C#] Enable copying of GPU OrtValue to CPU
#21244 commented on Jul 11, 2024 • 0 new comments
TArray used for broadcast was limited to be within range [0, 8] on onnxruntime 1.16.3
#21254 commented on Jul 11, 2024 • 0 new comments
[Performance] How does onnxruntime run in parallel mode?
#21259 commented on Jul 11, 2024 • 0 new comments
ROCM EP convolution fails due to missing
#19566 commented on Jul 11, 2024 • 0 new comments
[Build] CUDA Illegal Memory Access error when using a custom Triton kernel
#20885 commented on Jul 11, 2024 • 0 new comments
XGBoost incremental training, issue with ONNX Conversion
#18841 commented on Jul 11, 2024 • 0 new comments
[Build] Build python interface for Onnxruntime-qnn on aarch64 Linux
#21203 commented on Jul 11, 2024 • 0 new comments
TensorrtExecutionProvider slower than CUDAExecutionProvider: Faster-rcnn [Performance]
#17434 commented on Jul 10, 2024 • 0 new comments
[Build] how to build on openharmony?
#20895 commented on Jul 10, 2024 • 0 new comments
[Mobile][Kotlin] OnnxTensor.createTensor from floatBuffer takes up 7 seconds
#16937 commented on Jul 10, 2024 • 0 new comments
[Mobile] Issue with Importing YOLOv8 Pose Model into Unity using Sentis
#21253 commented on Jul 10, 2024 • 0 new comments
[Performance] Severe performance penalty with transformer model and DirectML
#20983 commented on Jul 10, 2024 • 0 new comments
[Build] Float16_t and BFloat16_t compile error
#20564 commented on Jul 10, 2024 • 0 new comments
DML EP takes very long time and not exit compiling
#21255 commented on Jul 10, 2024 • 0 new comments
DML cannot use device_id = 1 , run_with_iobinding failed.
#21092 commented on Jul 10, 2024 • 0 new comments
NOT_IMPLEMENTED : Could not find an implementation for ReduceProd(18) node with name 'p2o.ReduceProd.0'
#20693 commented on Jul 9, 2024 • 0 new comments
Incorrect result for converted FP16 model with Conv Op when run on arm64 Linux with onnxruntime >= 1.15.0
#18992 commented on Jul 9, 2024 • 0 new comments