Insights: microsoft/onnxruntime
Overview
32 Pull requests merged by 24 people
-
Migraphx ep windows build
#21284 merged
Jul 12, 2024 -
[VitisAI] custom op support multiple outputs
#21280 merged
Jul 11, 2024 -
Implement FlashAttention for CPU
#20805 merged
Jul 11, 2024 -
Minor updates to Java docs
#21269 merged
Jul 11, 2024 -
Enable Android CI build stages to run in parallel.
#21314 merged
Jul 11, 2024 -
Move QNN nuget package stages out of the big Nuget packaging pipeline.
#21306 merged
Jul 11, 2024 -
Fix typos - 1st Wave
#21278 merged
Jul 11, 2024 -
Fix lint C++ actions
#21303 merged
Jul 11, 2024 -
Enable LTO for Android build
#21243 merged
Jul 11, 2024 -
[DirectML] Broadcast NC-dims for Tensors A&B in DynamicQuantizeMatMul
#21298 merged
Jul 11, 2024 -
[MLAS] AArch64 SQNBitGemm CompInt8 initial multi-row implementation
#21193 merged
Jul 10, 2024 -
Update absl
#21300 merged
Jul 10, 2024 -
[QNN EP] Initial INT4 support
#21171 merged
Jul 10, 2024 -
[Fix] InterOpNumThreads Session Option for ONNX ReactNative Package
#21263 merged
Jul 10, 2024 -
[build] allow MPI on Unix when NCCL is disabled
#21175 merged
Jul 10, 2024 -
[ROCm] fix: obtain AMD GPU memory info through rocm_smi library
#21190 merged
Jul 10, 2024 -
[VSINPU]Code improvement && Slice/Dropout OP support
#21217 merged
Jul 10, 2024 -
[Build] Propagate build option for CUDA minimal to TRT
#20695 merged
Jul 9, 2024 -
[NNAPI EP] Track skipped initializer usage
#21286 merged
Jul 9, 2024 -
Fix ETW Sink Initialize unproperly locking
#21226 merged
Jul 9, 2024 -
Update OpenVino CI Ubuntu to 22.04
#21127 merged
Jul 9, 2024 -
[WebNN EP] Release WebNN MLGraphBuilder after Compile to free memory
#21200 merged
Jul 9, 2024 -
Remove core/common/gsl.h
#20894 merged
Jul 9, 2024 -
Added requested install instructions to ORT ROCm Python.
#21124 merged
Jul 8, 2024 -
[js/webnn] Enable user-supplied MLContext
#20600 merged
Jul 8, 2024 -
[WebNN EP] Remove constraint for conv ops on CPU backend
#21237 merged
Jul 8, 2024 -
[vitisai] Fix build failure introduced by #20920
#21247 merged
Jul 8, 2024 -
Java API Docs for GenerateAPI
#21125 merged
Jul 5, 2024 -
Add MatMulNBits shape infer to SymbolicShapeInference
#21246 merged
Jul 5, 2024 -
Fix typo in genai build DML from source steps
#21268 merged
Jul 5, 2024 -
[Fix Bug] Fp8*Fp8 Run Error
#20911 merged
Jul 5, 2024 -
Use cuda memset async
#21216 merged
Jul 5, 2024
22 Pull requests opened by 19 people
-
Use the scalar MlasSgemm CopyPackB and TransposePackB implementation for RISCV
#21261 opened
Jul 5, 2024 -
[Fix] Exception in iosDynamicFramework Post-Merge workflow
#21262 opened
Jul 5, 2024 -
Csharp: Add CopyOutputsToCpu API
#21274 opened
Jul 7, 2024 -
Fix a build error when CUDA is enabled and onnxruntime_DISABLE_CONTRIB_OPS is ON
#21285 opened
Jul 8, 2024 -
[WebNN EP] Support ConvTranspose for TFLite backend
#21291 opened
Jul 9, 2024 -
[WebNN EP] ConvTranspose should calculate the pads or output shape
#21292 opened
Jul 9, 2024 -
[VitisAI] fix graph save
#21293 opened
Jul 9, 2024 -
change ci docker image to rocm6.1
#21296 opened
Jul 9, 2024 -
[WebNN EP] Enable IO Bindings with MLBuffer
#21301 opened
Jul 9, 2024 -
Fix ETW Sink Initialize unproperly locking (#21226)
#21302 opened
Jul 10, 2024 -
Fix Android build on Windows
#21304 opened
Jul 10, 2024 -
Allow Memory Efficient Attention Kernel to run when local window size is set
#21310 opened
Jul 10, 2024 -
[js/webgpu] Remove unnecessary initialization of var
#21312 opened
Jul 11, 2024 -
Extend QDQPropagation transformer to handle multiple consumers
#21313 opened
Jul 11, 2024 -
Fix bert profiler bug
#21315 opened
Jul 11, 2024 -
Update DirectML from 1.14.1 to 1.15.0
#21323 opened
Jul 11, 2024 -
Add ML Program support for basic activation ops
#21326 opened
Jul 11, 2024 -
Refactor onnxruntime_fetchcontent_makeavailable cmake function
#21328 opened
Jul 11, 2024 -
Move ReluQuantFusion to Level2 for CPU EP only
#21329 opened
Jul 12, 2024 -
disable symbolic shape inference by default
#21330 opened
Jul 12, 2024 -
Remove shape infer from bridge ort
#21331 opened
Jul 12, 2024 -
Move Gelu and LayerNorm fusion to L1 optimization
#21332 opened
Jul 12, 2024
14 Issues closed by 10 people
-
[Documentation] Typo in tutorials at the top of the official webpage
#21146 closed
Jul 11, 2024 -
[Mobile] Pre-built 1.18.1 lib is missing for onnxruntime-android
#21305 closed
Jul 10, 2024 -
[Build] Cross-compiling ONNX for Android on Windows CMAKE Ninja error
#21242 closed
Jul 10, 2024 -
[Performance] Unexpected prediction for OCR model in Flask multithreading
#21288 closed
Jul 10, 2024 -
Help needed to export in ONNX
#21282 closed
Jul 9, 2024 -
[Build] update version of "cutlass"
#19891 closed
Jul 8, 2024 -
[Build] Error when build directly from repository
#21266 closed
Jul 8, 2024 -
[Build] support for CPython 3.13.0b1
#20832 closed
Jul 8, 2024 -
[Build] Missing DLL onnxruntime_providers_cuda.dll. Where do you get this dll?
#21256 closed
Jul 8, 2024 -
[Documentation] Do intra_op_num_threads and inter_op_num_threads not correspond to thread_pool_size?
#21252 closed
Jul 8, 2024 -
[TensorRT EP] OOM (RAM) when loading ONNX model
#21219 closed
Jul 7, 2024 -
Issue with performing shape inference using symbolic_shape_infer.py with Phi-3 ONNX Models
#21194 closed
Jul 6, 2024
30 Issues opened by 30 people
-
[Training] [ShapeInferenceError] Dimension could not be inferred: incompatible shapes
#21327 opened
Jul 11, 2024 -
Model saved with offline basic optimizations will not load - ShapeInferenceError
#21325 opened
Jul 11, 2024 -
[Crash] Crash while loading AlibabaNLP/gte-base ONNX model
#21322 opened
Jul 11, 2024 -
Not able to load onnx model multilingual-e5-large
#21321 opened
Jul 11, 2024 -
QDQ removal around Resize (mode=linear) causes wrong numeric values
#21319 opened
Jul 11, 2024 -
[Web]
#21318 opened
Jul 11, 2024 -
[Build] ModuleNotFoundError: No module named 'onnxruntime.capi'
#21317 opened
Jul 11, 2024 -
[Feature Request]
#21316 opened
Jul 11, 2024 -
ROCm Conv is not thread safe
#21311 opened
Jul 10, 2024 -
NativeMethod failed with NullReferenceException under c# ONNX in Windows_NT
#21309 opened
Jul 10, 2024 -
[Build] breakage with protobuf-27.2, due to API change
#21308 opened
Jul 10, 2024 -
[TensorRT] Caching to a dedicated ONNX file does not work
#21307 opened
Jul 10, 2024 -
windows arm64(Snapdragon(R) X 12-core X1E80100 @ 3.40 GHz) [Feature Request]
#21295 opened
Jul 9, 2024 -
Tried to specify the thread pool when creating an OrtEnvironment, but one already exists
#21290 opened
Jul 9, 2024 -
[Mobile] Android/Kotlin/JAVA Multi Threading for Multi models in android app
#21289 opened
Jul 9, 2024 -
[Feature Request] SpaceToDepth & DepthToSpace integer implementations
#21287 opened
Jul 8, 2024 -
[Build] Heap overflow caused by the onnx runtime
#21283 opened
Jul 8, 2024 -
C++ library SONAME without full version number
#21281 opened
Jul 8, 2024 -
[VitisAI] Wrong model path after using std::filesystem::path
#21279 opened
Jul 8, 2024 -
onnxruntime quantization weights not tied
#21277 opened
Jul 8, 2024 -
[TensorRT ExecutionProvider] Cannot infer the model on a GPU device with an ID other than 0
#21276 opened
Jul 8, 2024 -
[Web] Inconsistent results between running onnx model through python and with onnxruntime-web
#21275 opened
Jul 7, 2024 -
CUDA_PATH is set but CUDA wasn't able to be loaded
#21272 opened
Jul 6, 2024 -
[Feature Request] MPS provider
#21271 opened
Jul 6, 2024 -
Import error with onnxruntime-directml 1.18.1
#21270 opened
Jul 6, 2024 -
onnxruntime_perf_test.exe failing starting v1.18.0
#21267 opened
Jul 5, 2024 -
[Build] Problems when use onnxruntime
#21264 opened
Jul 5, 2024
68 Unresolved conversations
Sometimes conversations happen on old items that aren’t yet closed. Here is a list of all the Issues and Pull Requests with unresolved conversations.
-
Implementation of IObinding in Mixtral MoE Parity Script
#21153 commented on
Jul 12, 2024 • 25 new comments -
[Optimizer] DQ + MatMul to MatMulNBits support
#21180 commented on
Jul 11, 2024 • 14 new comments -
Replace Android CI vmImage: 'MacOS-12' with vmImage: 'ubuntu-latest'
#21172 commented on
Jul 11, 2024 • 9 new comments -
Added WebNN Intro and Tutorial
#20719 commented on
Jul 11, 2024 • 8 new comments -
[WIP] Contrib operator for Deformable Multi-Scale Attention
#20184 commented on
Jul 8, 2024 • 4 new comments -
Replace inline pip install with pip install from requirements*.txt
#21106 commented on
Jul 10, 2024 • 2 new comments -
Enable AVX NE CONVERT for FP16 to FP32 cast
#21183 commented on
Jul 11, 2024 • 2 new comments -
Add warning for scale being too small to quantize bias
#21155 commented on
Jul 5, 2024 • 2 new comments -
Add QNN EP option context_node_name_prefix to set EPContext node name prefix
#21236 commented on
Jul 12, 2024 • 1 new comment -
[Training] Fix Overflow Handling in Cast Infer for ORTModule.
#21202 commented on
Jul 5, 2024 • 1 new comment -
Enablement of onnxruntime for AIX and fixing issues related to big-endian platform.
#21133 commented on
Jul 9, 2024 • 1 new comment -
Adds ATen fallback for scaled_dot_product_attention
#21107 commented on
Jul 9, 2024 • 1 new comment -
Fix broken GRU tests
#15914 commented on
Jul 10, 2024 • 0 new comments -
[DML EP] Add BFC allocator
#16634 commented on
Jul 10, 2024 • 0 new comments -
Update pool to MacOS-13
#17361 commented on
Jul 10, 2024 • 0 new comments -
[DML EP] Add graph support for concat when some provided inputs are 0-dimension tensors
#17501 commented on
Jul 10, 2024 • 0 new comments -
Improve DML session initialization time & fix error when unpacking Tensors in DML Backend
#19220 commented on
Jul 11, 2024 • 0 new comments -
[Performance] Massive Performance slowdown from v1.13.1 -> 1.14.0
#20400 commented on
Jul 12, 2024 • 0 new comments -
Big endian issue: Graph Transformation Attention Fusion tests are failing
#12921 commented on
Jul 11, 2024 • 0 new comments -
[Training] Cannot export model for inferencing from session created from buffers
#21152 commented on
Jul 11, 2024 • 0 new comments -
[Documentation] How Configure CUDA 12.* and cuDNN for GPU with ONNX Runtime and C# on Windows 11
#21212 commented on
Jul 11, 2024 • 0 new comments -
SIGSEGV on CoreMLExecutionProvider when using dynamic batch
#21227 commented on
Jul 11, 2024 • 0 new comments -
[Training] ImportError: cannot import name 'PropagateCastOpsStrategy' from 'onnxruntime.capi._pybind_state'
#21233 commented on
Jul 11, 2024 • 0 new comments -
Disable algo caching in ROCM EP
#19567 commented on
Jul 10, 2024 • 0 new comments -
Mlas int4 int8 with avx2/512
#20687 commented on
Jul 11, 2024 • 0 new comments -
VitisAI EP Context Model
#20926 commented on
Jul 12, 2024 • 0 new comments -
[TensorRT EP] Simplify TRTEP version update logic
#21061 commented on
Jul 11, 2024 • 0 new comments -
Connecting fp16xq4 gemm kernels (optimized for A100) to MatMulNBits<fp16> operator
#21083 commented on
Jul 10, 2024 • 0 new comments -
Enabling S8S8 and S8U8 handling in QGemm for AVX2 and AVX-VNNI
#21123 commented on
Jul 8, 2024 • 0 new comments -
Bump socket.io from 4.6.1 to 4.7.5 in /js/web
#21126 commented on
Jul 11, 2024 • 0 new comments -
Keep QDQ nodes w/ nonpositive scale around MaxPool
#21182 commented on
Jul 8, 2024 • 0 new comments -
Implementation of set membership
#21222 commented on
Jul 8, 2024 • 0 new comments -
[js/webgpu] Enable conv+clip fuse on mobilenetv2-12-f16
#21234 commented on
Jul 11, 2024 • 0 new comments -
[WIP][JS/WebGPU] Initial changes to support wasm64.
#21260 commented on
Jul 11, 2024 • 0 new comments -
[Bug] Coqui VITS ONNX model can't be statically quantized.
#16738 commented on
Jul 9, 2024 • 0 new comments -
[Feature Request] Move graph compilation behind higher transformers (graph optimization)
#20915 commented on
Jul 9, 2024 • 0 new comments -
[Web] WebGPU and WASM Backends Unavailable within Service Worker
#20876 commented on
Jul 9, 2024 • 0 new comments -
[Build] DLLs in maven build are not digitally signed
#19204 commented on
Jul 8, 2024 • 0 new comments -
Support Numpy v2.0
#21063 commented on
Jul 8, 2024 • 0 new comments -
Using ML.Net and ONNX in Alpine Docker gives library load error.
#8162 commented on
Jul 8, 2024 • 0 new comments -
Stateful/Memory models
#20943 commented on
Jul 8, 2024 • 0 new comments -
Error in quantize vicuna-7b model from fp16 to int8
#20867 commented on
Jul 7, 2024 • 0 new comments -
ONNXruntime version 1.18.0
#20877 commented on
Jul 7, 2024 • 0 new comments -
[Performance] Is my script set to get optimal performance of onnxruntime?
#20945 commented on
Jul 7, 2024 • 0 new comments -
Mac m1 build android.The compiler doesn't support BFLOAT16!!!
#20948 commented on
Jul 7, 2024 • 0 new comments -
How can I debug a reproducible error?
#20792 commented on
Jul 6, 2024 • 0 new comments -
[Documentation] How to run this model on android mobile platform
#20937 commented on
Jul 6, 2024 • 0 new comments -
[Web] cannot load onnx model in a vite/react project, because of error expected magic word 00 61 73 6d, found 3c 21 44 4f @+0
#19556 commented on
Jul 6, 2024 • 0 new comments -
[Build] build python wheel fails
#21145 commented on
Jul 6, 2024 • 0 new comments -
Initialization crash using OnnxRuntime 17.0 (previously working on 16.3)
#21205 commented on
Jul 5, 2024 • 0 new comments -
How to specify fusion rules with quantization?
#21251 commented on
Jul 5, 2024 • 0 new comments -
[C#] Enable copying of GPU OrtValue to CPU
#21244 commented on
Jul 11, 2024 • 0 new comments -
TArray used for broadcast was limited to be within range [0, 8] on onnxruntime 1.16.3
#21254 commented on
Jul 11, 2024 • 0 new comments -
[Performance] How does onnxruntime run in parallel mode?
#21259 commented on
Jul 11, 2024 • 0 new comments -
ROCM EP convolution fails due to missing
#19566 commented on
Jul 11, 2024 • 0 new comments -
[Build] CUDA Illegal Memory Access error when using a custom Triton kernel
#20885 commented on
Jul 11, 2024 • 0 new comments -
XGBoost incremental training, issue with ONNX Conversion
#18841 commented on
Jul 11, 2024 • 0 new comments -
[Build] Build python interface for Onnxruntime-qnn on aarch64 Linux
#21203 commented on
Jul 11, 2024 • 0 new comments -
TensorrtExecutionProvider slower than CUDAExecutionProvider: Faster-rcnn [Performance]
#17434 commented on
Jul 10, 2024 • 0 new comments -
[Build] how to build on openharmony?
#20895 commented on
Jul 10, 2024 • 0 new comments -
[Mobile][Kotlin] OnnxTensor.createTensor from floatBuffer takes up 7 seconds
#16937 commented on
Jul 10, 2024 • 0 new comments -
[Mobile] Issue with Importing YOLOv8 Pose Model into Unity using Sentis
#21253 commented on
Jul 10, 2024 • 0 new comments -
[Performance] Severe performance penalty with transformer model and DirectML
#20983 commented on
Jul 10, 2024 • 0 new comments -
[Build] Float16_t and BFloat16_t compile error
#20564 commented on
Jul 10, 2024 • 0 new comments -
DML EP takes very long time and not exit compiling
#21255 commented on
Jul 10, 2024 • 0 new comments -
DML cannot use device_id = 1 , run_with_iobinding failed.
#21092 commented on
Jul 10, 2024 • 0 new comments -
NOT_IMPLEMENTED : Could not find an implementation for ReduceProd(18) node with name 'p2o.ReduceProd.0'
#20693 commented on
Jul 9, 2024 • 0 new comments -
Incorrect result for converted FP16 model with Conv Op when run on arm64 Linux with onnxruntime >= 1.15.0
#18992 commented on
Jul 9, 2024 • 0 new comments