|
|
|
|
|
|
[[databuffers]]
= Data Buffers and Codecs

Java NIO provides `ByteBuffer`, but many libraries build their own byte buffer API on top,
especially for network operations where reusing buffers and/or using direct buffers is
beneficial for performance. For example, Netty has the `ByteBuf` hierarchy, Undertow uses
XNIO, Jetty uses pooled byte buffers with a callback to be released, and so on.
The `spring-core` module provides a set of abstractions to work with various byte buffer
APIs, as follows:

* <<databuffers-factory>> abstracts the creation of a data buffer.
* <<databuffers-buffer>> represents a byte buffer, which may be
  <<databuffers-buffer-pooled,pooled>>.
* <<databuffers-utils>> offers utility methods for data buffers.
* <<codecs>> decode or encode data buffer streams into higher-level objects.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
[[databuffers-factory]]
== `DataBufferFactory`

`DataBufferFactory` is used to create data buffers in one of two ways:

. Allocate a new data buffer, optionally specifying the capacity upfront, if known, which is
  more efficient even though implementations of `DataBuffer` can grow and shrink on demand.
. Wrap an existing `byte[]` or `java.nio.ByteBuffer`, which decorates the given data with
  a `DataBuffer` implementation and does not involve allocation.

Note that WebFlux applications do not create a `DataBufferFactory` directly but instead
access it through the `ServerHttpResponse`, or through the `ClientHttpRequest` on the client side.
The type of factory depends on the underlying client or server: for example,
`NettyDataBufferFactory` for Reactor Netty and `DefaultDataBufferFactory` for other
platforms (such as Servlet 3.1+ servers).
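
The following sketch shows both creation styles. `DefaultDataBufferFactory` is assumed here only
to keep the example self-contained; in a WebFlux application, the factory would instead be
obtained as described in the preceding note:

====
[source,java,indent=0]
----
	import java.nio.ByteBuffer;
	import java.nio.charset.StandardCharsets;

	import org.springframework.core.io.buffer.DataBuffer;
	import org.springframework.core.io.buffer.DataBufferFactory;
	import org.springframework.core.io.buffer.DefaultDataBufferFactory;

	public class DataBufferFactoryExample {

		public static void main(String[] args) {
			DataBufferFactory factory = new DefaultDataBufferFactory();

			// 1. Allocate a new buffer, optionally hinting at the expected capacity
			DataBuffer allocated = factory.allocateBuffer(1024);
			allocated.write("Spring".getBytes(StandardCharsets.UTF_8));

			// 2. Wrap existing data; this only decorates it and does not allocate
			DataBuffer wrapped = factory.wrap(ByteBuffer.wrap("Spring".getBytes(StandardCharsets.UTF_8)));

			System.out.println(allocated.readableByteCount()); // 6
			System.out.println(wrapped.readableByteCount());   // 6
		}
	}
----
====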
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
[[databuffers-buffer]]
== `DataBuffer`

The `DataBuffer` interface offers operations similar to those of `java.nio.ByteBuffer` but also
brings a few additional benefits, some of which are inspired by the Netty `ByteBuf`.
Below is a partial list of benefits:

* Read and write with independent positions, that is, not requiring a call to `flip()` to
  alternate between read and write.
* Capacity expanded on demand, as with `java.lang.StringBuilder`.
* Pooled buffers and reference counting via <<databuffers-buffer-pooled>>.
* View a buffer as `java.nio.ByteBuffer`, `InputStream`, or `OutputStream`.
* Determine the index, or the last index, of a given byte.

In general, the following invariant holds for the read position, write position, and the capacity:

====
[literal]
[subs="verbatim,quotes"]
--
	0 <= read position <= write position <= capacity
--
====

When reading bytes from the `DataBuffer`, the read position is automatically advanced by the
amount of data read. Similarly, when writing bytes, the write position is advanced by the amount
of data written, and the capacity is expanded as necessary.
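
A minimal sketch of the independent read and write positions, again assuming a
`DefaultDataBufferFactory` purely for illustration:

====
[source,java,indent=0]
----
	import java.nio.charset.StandardCharsets;

	import org.springframework.core.io.buffer.DataBuffer;
	import org.springframework.core.io.buffer.DefaultDataBufferFactory;

	public class DataBufferPositionsExample {

		public static void main(String[] args) {
			DataBuffer buffer = new DefaultDataBufferFactory().allocateBuffer(4);

			// Writing grows the capacity on demand and advances the write position to 11
			buffer.write("hello world".getBytes(StandardCharsets.UTF_8));

			// Reading advances the read position only; no flip() is needed in between
			byte[] hello = new byte[5];
			buffer.read(hello);

			System.out.println(new String(hello, StandardCharsets.UTF_8)); // hello
			System.out.println(buffer.readPosition());                     // 5
			System.out.println(buffer.writePosition());                    // 11
			System.out.println(buffer.indexOf(b -> b == 'w', 0));          // 6
		}
	}
----
====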
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
[[databuffers-buffer-pooled]]
== `PooledDataBuffer`

As explained in the Javadoc for
https://docs.oracle.com/javase/8/docs/api/java/nio/ByteBuffer.html[ByteBuffer],
byte buffers can be direct or non-direct. Direct buffers may reside outside the Java heap,
which eliminates the need for copying for native I/O operations. That makes direct buffers
particularly useful for receiving and sending data over a socket, but they are also more
expensive to create and release, which leads to the idea of pooling buffers.

`PooledDataBuffer` is an extension of `DataBuffer` that helps with reference counting, which
is essential for byte buffer pooling. How does it work? When a `PooledDataBuffer` is
allocated, the reference count is at 1. Calls to `retain()` increment the count, while
calls to `release()` decrement it. As long as the count is above 0, the buffer is
guaranteed not to be released. When the count is decreased to 0, the pooled buffer can be
released, which in practice could mean the reserved memory for the buffer is returned to
the memory pool.

Note that instead of operating on `PooledDataBuffer` directly, in most cases it is better
to use the convenience methods in `DataBufferUtils` that apply release or retain to a
`DataBuffer` only if it is an instance of `PooledDataBuffer`.
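
The following sketch assumes a `NettyDataBufferFactory` backed by Netty's default pooled
allocator and walks the reference count up and down with the `DataBufferUtils` convenience
methods (which are safe no-ops for buffers that are not pooled):

====
[source,java,indent=0]
----
	import io.netty.buffer.PooledByteBufAllocator;

	import org.springframework.core.io.buffer.DataBuffer;
	import org.springframework.core.io.buffer.DataBufferUtils;
	import org.springframework.core.io.buffer.NettyDataBufferFactory;

	public class ReferenceCountingExample {

		public static void main(String[] args) {
			NettyDataBufferFactory factory = new NettyDataBufferFactory(PooledByteBufAllocator.DEFAULT);

			DataBuffer buffer = factory.allocateBuffer();         // reference count: 1
			DataBufferUtils.retain(buffer);                       // reference count: 2
			DataBufferUtils.release(buffer);                      // reference count: 1, still usable
			boolean released = DataBufferUtils.release(buffer);   // reference count: 0, returned to the pool

			System.out.println(released); // true
		}
	}
----
====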
|
|
|
|
|
|
|
|
|
|
|
[[databuffers-utils]]
== `DataBufferUtils`

`DataBufferUtils` offers a number of utility methods to operate on data buffers:

* Join a stream of data buffers into a single buffer, possibly with zero copy (for example,
  via composite buffers), if that is supported by the underlying byte buffer API.
* Turn an `InputStream` or NIO `Channel` into `Flux<DataBuffer>`, and vice versa turn a
  `Publisher<DataBuffer>` into an `OutputStream` or NIO `Channel`.
* Release or retain a `DataBuffer` if the buffer is an instance of `PooledDataBuffer`.
* Skip or take from a stream of bytes until a specific byte count.

Joining a whole stream (for example, an entire HTTP body) into a single buffer can be convenient
when dealing with older, blocking APIs, but note that it holds all of the data in memory and
therefore uses more memory than a pure streaming solution would.
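
For illustration, the following sketch streams a classpath resource (the `data.txt` name is
hypothetical) as `Flux<DataBuffer>` and then joins the stream into a single buffer:

====
[source,java,indent=0]
----
	import java.nio.charset.StandardCharsets;

	import reactor.core.publisher.Flux;
	import reactor.core.publisher.Mono;

	import org.springframework.core.io.ClassPathResource;
	import org.springframework.core.io.buffer.DataBuffer;
	import org.springframework.core.io.buffer.DataBufferUtils;
	import org.springframework.core.io.buffer.DefaultDataBufferFactory;

	public class DataBufferUtilsExample {

		public static void main(String[] args) {
			// Stream the resource as a sequence of buffers of up to 4096 bytes each
			Flux<DataBuffer> buffers = DataBufferUtils.read(
					new ClassPathResource("data.txt"), new DefaultDataBufferFactory(), 4096);

			// Join the stream into one buffer, copy its content, and release it
			Mono<String> content = DataBufferUtils.join(buffers).map(buffer -> {
				byte[] bytes = new byte[buffer.readableByteCount()];
				buffer.read(bytes);
				DataBufferUtils.release(buffer);
				return new String(bytes, StandardCharsets.UTF_8);
			});

			System.out.println(content.block());
		}
	}
----
====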
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
[[codecs]]
== Codecs

The `org.springframework.core.codec` package provides the following strategy interfaces:

* `Encoder` to encode `Publisher<T>` into a stream of data buffers.
* `Decoder` to decode `Publisher<DataBuffer>` into a stream of higher-level objects.

The `spring-core` module provides `byte[]`, `ByteBuffer`, `DataBuffer`, `Resource`, and
`String` encoder and decoder implementations. The `spring-web` module adds Jackson JSON,
Jackson Smile, JAXB2, Protocol Buffers, and other encoders and decoders. In WebFlux, codecs
convert the contents of a request body into a controller method argument, or convert a return
value into the response body. See <<web-reactive.adoc#webflux-codecs,Codecs>> in the WebFlux
section for more details, including how to configure or customize the default codecs.
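
A minimal sketch of the `Encoder` and `Decoder` contract, using the built-in
`CharSequenceEncoder` and `StringDecoder` from `spring-core` (the latter splits the decoded
text on newlines by default):

====
[source,java,indent=0]
----
	import java.util.Collections;

	import reactor.core.publisher.Flux;

	import org.springframework.core.ResolvableType;
	import org.springframework.core.codec.CharSequenceEncoder;
	import org.springframework.core.codec.StringDecoder;
	import org.springframework.core.io.buffer.DataBuffer;
	import org.springframework.core.io.buffer.DefaultDataBufferFactory;
	import org.springframework.util.MimeTypeUtils;

	public class CodecExample {

		public static void main(String[] args) {
			ResolvableType type = ResolvableType.forClass(String.class);

			// Encoder: Publisher<String> -> stream of data buffers
			Flux<DataBuffer> buffers = CharSequenceEncoder.textPlainOnly().encode(
					Flux.just("hello\n", "world\n"), new DefaultDataBufferFactory(), type,
					MimeTypeUtils.TEXT_PLAIN, Collections.emptyMap());

			// Decoder: stream of data buffers -> Publisher<String> (releases the buffers it reads)
			Flux<String> strings = StringDecoder.textPlainOnly().decode(
					buffers, type, MimeTypeUtils.TEXT_PLAIN, Collections.emptyMap());

			strings.subscribe(System.out::println); // prints "hello" then "world"
		}
	}
----
====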
|
|
|
[[databuffers-using]]
== Using `DataBuffer`
|
|
|
|
|
|
|
|
|
|
|
When working with data buffers, special care must be taken to ensure buffers are released,
since they may be <<databuffers-buffer-pooled,pooled>>. We will use codecs to illustrate how
that works, but the concepts apply more generally. Let's see what codecs must do internally
to manage data buffers.

A `Decoder` is the last to read input data buffers before creating higher-level objects, and
therefore it must release them as follows:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
. If a `Decoder` simply reads each input buffer and is ready to release it immediately,
  it can do so via `DataBufferUtils.release(dataBuffer)`.
. If a `Decoder` is using `Flux` or `Mono` operators such as `flatMap`, `reduce`, and
  others that prefetch and cache data items internally, or is using operators such as
  `filter`, `skip`, and others that leave out items, then
  `doOnDiscard(PooledDataBuffer.class, DataBufferUtils::release)` must be added to the
  composition chain to ensure such buffers are released prior to being discarded, possibly
  also as a result of an error or cancellation signal (see the sketch after this list).
. If a `Decoder` holds on to one or more data buffers in any other way, it must ensure they
  are released when fully read, or in case of an error or cancellation signal that takes
  place before the cached data buffers have been read and released.
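
For example, the following hypothetical helper (a sketch, not an actual `Decoder`
implementation) releases each buffer once it is fully read and registers a `doOnDiscard` hook
so that buffers dropped due to an error or cancellation are released as well:

====
[source,java,indent=0]
----
	import java.nio.charset.StandardCharsets;

	import reactor.core.publisher.Flux;

	import org.springframework.core.io.buffer.DataBuffer;
	import org.springframework.core.io.buffer.DataBufferUtils;
	import org.springframework.core.io.buffer.PooledDataBuffer;

	public class ReleaseAwareDecoding {

		static Flux<String> decodeToStrings(Flux<DataBuffer> input) {
			return input
					.map(buffer -> {
						byte[] bytes = new byte[buffer.readableByteCount()];
						buffer.read(bytes);
						DataBufferUtils.release(buffer); // release as soon as the buffer is fully read
						return new String(bytes, StandardCharsets.UTF_8);
					})
					// ensure buffers that never reach map() (e.g. on cancellation) are also released
					.doOnDiscard(PooledDataBuffer.class, DataBufferUtils::release);
		}
	}
----
====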
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Note that `DataBufferUtils#join` offers a safe and efficient way to aggregate a data buffer
stream into a single data buffer. Likewise, `skipUntilByteCount` and `takeUntilByteCount`
are additional safe methods for decoders to use.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
An `Encoder` allocates data buffers that others must read (and release), so an `Encoder` does
not have much to do. However, an `Encoder` must take care to release a data buffer if a
serialization error occurs while populating the buffer with data. For example:
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
====
[source,java,indent=0]
[subs="verbatim,quotes"]
----
	DataBuffer buffer = factory.allocateBuffer();
	boolean release = true;
	try {
		// serialize and populate buffer..
		release = false;
	}
	finally {
		if (release) {
			DataBufferUtils.release(buffer);
		}
	}
	return buffer;
----
====
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
The consumer of an `Encoder` is responsible for releasing the data buffers it receives.
In a WebFlux application, the output of the `Encoder` is used to write to the HTTP server
response or to the client HTTP request, in which case releasing the data buffers is the
responsibility of the code writing to the server response or to the client request.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Note that, when running on Netty, there are debugging options for
https://github.com/netty/netty/wiki/Reference-counted-objects#troubleshooting-buffer-leaks[troubleshooting buffer leaks].
|
|
|
|