Changelog
Source: NEWS.md
pjrt (development version)
Bug fixes
- check_err() no longer leaks the underlying PJRT_Error when converting a plugin error into an R exception.
- Reading a buffer back to the host now respects the device buffer’s actual memory layout. A non-row-major (but untiled) executable output, e.g. one pinned to a column-major layout via mhlo.layout_mode, is reordered correctly instead of being returned transposed. A layout the readback cannot faithfully reorder (strided, tiled, or rank-mismatched) now raises a clear error rather than silently returning wrong data.
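Why layout matters on readback can be illustrated in base R alone. The sketch below does not use pjrt; the variable names are made up for illustration, and the flat vectors only stand in for the memory a device buffer might hold.

```r
# The logical 2 x 3 matrix we expect to get back on the host.
expected <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)

# The same six values as they would sit in flat memory under each layout.
row_major_data <- c(1L, 2L, 3L, 4L, 5L, 6L)  # rows stored contiguously
col_major_data <- c(1L, 4L, 2L, 5L, 3L, 6L)  # columns stored contiguously

# R arrays are column-major, so column-major data can be wrapped directly.
from_col_major <- array(col_major_data, dim = c(2, 3))

# Row-major data has to be read with reversed dims and then permuted.
from_row_major <- aperm(array(row_major_data, dim = c(3, 2)))

# Treating column-major data as if it were row-major puts elements
# in the wrong positions without raising any error.
misread <- aperm(array(col_major_data, dim = c(3, 2)))

print(from_col_major)
print(misread)
```

Both correct reads reproduce the expected matrix, while the misread has the right shape but the wrong element order, which is the kind of silent error this fix addresses.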
Features
- pjrt_buffer(), pjrt_scalar(), and pjrt_execute() now call R’s garbage collector and retry once when the plugin reports RESOURCE_EXHAUSTED. Unreferenced PJRTBuffer external pointers are finalized between attempts so their device memory is released before the retry.
- The first time a PJRT plugin needs to be downloaded, interactive sessions now ask for confirmation before downloading (similar to torch). Non-interactive sessions no longer download automatically. The PJRT_INSTALL environment variable overrides this: set it to "1" to always download without asking, or "0" to never download.
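The collect-and-retry behaviour can be sketched in plain R. This is not the package's implementation: retry_after_gc() and fake_alloc() are hypothetical names, and the fake allocator only simulates a plugin that reports RESOURCE_EXHAUSTED on its first call.

```r
# Hypothetical helper: run f(); on a RESOURCE_EXHAUSTED error, trigger
# garbage collection and try exactly once more. Other errors are re-raised.
retry_after_gc <- function(f) {
  tryCatch(
    f(),
    error = function(e) {
      if (!grepl("RESOURCE_EXHAUSTED", conditionMessage(e), fixed = TRUE)) {
        stop(e)
      }
      gc()  # lets finalizers of unreachable external pointers run
      f()   # single retry; a second failure propagates to the caller
    }
  )
}

# Simulated allocation that fails the first time and succeeds afterwards.
attempts <- 0L
fake_alloc <- function() {
  attempts <<- attempts + 1L
  if (attempts == 1L) stop("RESOURCE_EXHAUSTED: out of device memory")
  "buffer"
}

result <- retry_after_gc(fake_alloc)
```

For the download prompt, the override described above would be set before the plugin is first needed, for example with Sys.setenv(PJRT_INSTALL = "1") in a non-interactive script.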
pjrt 0.4.0
Features
- Added QR, LU, SVD, and symmetric eigendecomposition support on both CPU and CUDA via the FFI registration mechanism.
- Added a vignette on how to register custom calls via the FFI registration mechanism, covering both CUDA- and CPU-specific aspects.
- Added support for the bit64 package to better support long integers.
- pjrt_buffer(), pjrt_scalar(), and as_array() gain a check argument (default FALSE). When TRUE, the call errors instead of silently losing information: on input if data contains NAs, on output if the materialized R vector contains a value that’s indistinguishable from NA or that has wrapped through the integer container.
- as_array() on a ui32 buffer now returns a bit64::integer64 instead of a base integer, so values >= 2^31 round-trip losslessly rather than wrapping to negative.
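The reason a base integer cannot hold every ui32 value can be shown without pjrt. The sketch below relies only on base R and, if it is installed, on bit64::as.integer64(); the variable names are illustrative.

```r
# R's base integer type is a signed 32-bit value.
int_max <- .Machine$integer.max

# 2^31 is a valid unsigned 32-bit value, but one past the signed maximum.
big <- 2^31

# Base R cannot represent it as an integer: the conversion yields NA
# (with a warning, suppressed here).
as_base_int <- suppressWarnings(as.integer(big))

# A 64-bit integer container holds the value exactly.
if (requireNamespace("bit64", quietly = TRUE)) {
  as_i64 <- bit64::as.integer64(big)
  print(as_i64)
}
```

The same gap is why a value read back into a plain integer vector can be indistinguishable from NA, which is the case the check argument is described as guarding against.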
pjrt 0.3.0
Features
- Added the buffer_copy() function to copy a buffer between devices.
- New pjrt_register_custom_call() allows external packages to register C/C++ XLA FFI handlers with the PJRT plugin. Registration is deferred until the plugin loads, so handlers can be registered during .onLoad().
- pjrt_device() now returns cached PJRTDevice instances, so repeated calls for the same device yield objects with stable identity (useful for hashing and caching, e.g. in {anvil}’s JIT).
Bug fixes
- The configure script now uses the protoc compiler from the same installation as the linked protobuf library, preventing version mismatches when multiple protobuf versions are installed.
- Compiling a program for a specific CPU device (e.g. cpu:1) now targets that device instead of silently falling back to cpu:0.
- Fixed device targeting when compiling against a distributed PJRT client, where global device IDs and local hardware ordinals diverge.
pjrt 0.2.0
Asynchronous API
Operations such as host <-> device transfers and program execution were previously synchronous only. They are now asynchronous, which has considerable performance benefits, especially on GPU. Specifically:
- pjrt_buffer() and pjrt_execute() return immediately, but the returned buffer is not necessarily ready. To wait for the transfer or computation behind a buffer to finish, use await(). PJRT handles this internally, however, so users never have to call this function themselves.
- as_array() is still synchronous. There is now an asynchronous version, as_array_async(), although it is rarely needed. It returns a PJRTArrayPromise object, which can be converted to an R array/vector via value().
- To check whether a PJRTBuffer or PJRTArrayPromise is ready, use is_ready().
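A rough usage sketch of the functions named above follows. It runs only if pjrt is installed, and the argument shapes are assumptions: each call is shown with a single positional argument, which may not match the actual signatures or defaults.

```r
# Only attempt the calls when the package is available.
pjrt_available <- requireNamespace("pjrt", quietly = TRUE)

if (pjrt_available) {
  # Returns immediately; the transfer may still be in flight.
  buf <- pjrt::pjrt_buffer(c(1, 2, 3))

  # Non-blocking readiness check on the buffer.
  print(pjrt::is_ready(buf))

  # Asynchronous readback: yields a promise rather than an R array.
  promise <- pjrt::as_array_async(buf)
  print(pjrt::is_ready(promise))

  # value() converts the promise into an ordinary R array/vector.
  print(pjrt::value(promise))

  # The synchronous path remains available.
  print(pjrt::as_array(buf))
}
```

In most scripts the synchronous as_array() is enough; the asynchronous variant is mainly of interest when the host has other work to do while a readback completes.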
Features
- Added dtype support for PJRTBuffers via the tengen::dtype S3 generic. "bool" is now accepted as an alias for "i1"/"pred".
- Accept DataType objects in the dtype parameter of pjrt_buffer().
- Support device argument in pjrt_compile().
Bug fixes
- Protect from segfaults in raw to buffer conversion.
- Protect from segfault during device mismatch in pjrt_execute().