@@ -577,7 +577,7 @@ private Optional<Node> maybeNodeForPosition(TopicPartition partition,
* </p>
*
* <p>
- * Here's why this is important—in a production system, a given leader node serves as a leader for many partitions.
+ * Here's why this is important-in a production system, a given leader node serves as a leader for many partitions.
* From the client's perspective, it's possible that a node has a mix of both fetchable and unfetchable partitions.
* When the client determines which nodes to skip and which to fetch from, it's important that unfetchable
* partitions don't block fetchable partitions from being fetched.
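To make the javadoc's point concrete, here is a deliberately simplified model of the selection step it describes (the class, method, and helper shape are invented for illustration and are not the consumer's actual code): an unfetchable partition is simply skipped, so it never removes its leader node, or that node's other fetchable partitions, from the fetch round.

import java.util.*;
import java.util.function.Function;

final class FetchSelectionSketch {
    // Group fetchable partitions by their leader node. An unfetchable partition
    // yields Optional.empty() and is skipped; the node itself stays eligible
    // as long as at least one of its partitions is fetchable.
    static <N, P> Map<N, List<P>> fetchableByNode(Collection<P> partitions,
                                                  Function<P, Optional<N>> leaderIfFetchable) {
        Map<N, List<P>> byNode = new HashMap<>();
        for (P partition : partitions)
            leaderIfFetchable.apply(partition).ifPresent(node ->
                    byNode.computeIfAbsent(node, n -> new ArrayList<>()).add(partition));
        return byNode;
    }
}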
@@ -2189,7 +2189,7 @@ private void subscribeInternal(Collection<String> topics, Optional<ConsumerRebal
}

/**
- * Process the events—if any—that were produced by the {@link ConsumerNetworkThread network thread}.
+ * Process the events-if any-that were produced by the {@link ConsumerNetworkThread network thread}.
* It is possible that {@link ErrorEvent an error}
* could occur when processing the events. In such cases, the processor will take a reference to the first
* error, continue to process the remaining events, and then throw the first error that occurred.
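A minimal sketch of that contract (simplified, and using Runnable in place of the real event types; this is not the actual consumer code): the first failure is remembered, the remaining events are still processed, and the saved error is thrown last.

import java.util.Queue;

final class FirstErrorSketch {
    // Drain every event, remember only the first failure, and rethrow it at the end.
    static void processEvents(Queue<Runnable> events) {
        RuntimeException firstError = null;
        Runnable event;
        while ((event = events.poll()) != null) {
            try {
                event.run();
            } catch (RuntimeException e) {
                if (firstError == null)  // later errors are intentionally dropped
                    firstError = e;
            }
        }
        if (firstError != null)
            throw firstError;
    }
}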
@@ -242,7 +242,7 @@ void runOnce() {
}

/**
- * Process the events—if any—that were produced by the application thread.
+ * Process the events-if any-that were produced by the application thread.
*/
private void processApplicationEvents() {
LinkedList<ApplicationEvent> events = new LinkedList<>();
@@ -248,17 +248,17 @@ public class KafkaProducer<K, V> implements Producer<K, V> {
public static final String NETWORK_THREAD_PREFIX = "kafka-producer-network-thread";
public static final String PRODUCER_METRIC_GROUP_NAME = "producer-metrics";

- private static final String INIT_TXN_TIMEOUT_MSG = "InitTransactions timed out — " +
+ private static final String INIT_TXN_TIMEOUT_MSG = "InitTransactions timed out - " +
"did not complete coordinator discovery or " +
"receive the InitProducerId response within max.block.ms.";

private static final String SEND_OFFSETS_TIMEOUT_MSG =
"SendOffsetsToTransaction timed out did not reach the coordinator or " +
"SendOffsetsToTransaction timed out - did not reach the coordinator or " +
"receive the TxnOffsetCommit/AddOffsetsToTxn response within max.block.ms";
private static final String COMMIT_TXN_TIMEOUT_MSG =
"CommitTransaction timed out did not complete EndTxn with the transaction coordinator within max.block.ms";
"CommitTransaction timed out - did not complete EndTxn with the transaction coordinator within max.block.ms";
private static final String ABORT_TXN_TIMEOUT_MSG =
"AbortTransaction timed out did not complete EndTxn(abort) with the transaction coordinator within max.block.ms";
"AbortTransaction timed out - did not complete EndTxn(abort) with the transaction coordinator within max.block.ms";

private final String clientId;
// Visible for testing
@@ -176,7 +176,7 @@
public class KafkaProducerTest {

private static final String INIT_TXN_TIMEOUT_MSG =
"InitTransactions timed out " +
"InitTransactions timed out - " +
"did not complete coordinator discovery or " +
"receive the InitProducerId response within max.block.ms.";

4 changes: 2 additions & 2 deletions docs/introduction.html
@@ -147,15 +147,15 @@ <h4 class="anchor-heading">
<strong>Producers</strong> are those client applications that publish (write) events to Kafka, and <strong>consumers</strong> are those that subscribe to (read and process) these events. In Kafka, producers and consumers are fully decoupled and agnostic of each other, which is a key design element to achieve the high scalability that Kafka is known for. For example, producers never need to wait for consumers. Kafka provides various <a href="/documentation/#semantics">guarantees</a> such as the ability to process events exactly-once.
</p>
<p>
- Events are organized and durably stored in <strong>topics</strong>. Very simplified, a topic is similar to a folder in a filesystem, and the events are the files in that folder. An example topic name could be "payments". Topics in Kafka are always multi-producer and multi-subscriber: a topic can have zero, one, or many producers that write events to it, as well as zero, one, or many consumers that subscribe to these events. Events in a topic can be read as often as needed—unlike traditional messaging systems, events are not deleted after consumption. Instead, you define for how long Kafka should retain your events through a per-topic configuration setting, after which old events will be discarded. Kafka's performance is effectively constant with respect to data size, so storing data for a long time is perfectly fine.
+ Events are organized and durably stored in <strong>topics</strong>. Very simplified, a topic is similar to a folder in a filesystem, and the events are the files in that folder. An example topic name could be "payments". Topics in Kafka are always multi-producer and multi-subscriber: a topic can have zero, one, or many producers that write events to it, as well as zero, one, or many consumers that subscribe to these events. Events in a topic can be read as often as needed-unlike traditional messaging systems, events are not deleted after consumption. Instead, you define for how long Kafka should retain your events through a per-topic configuration setting, after which old events will be discarded. Kafka's performance is effectively constant with respect to data size, so storing data for a long time is perfectly fine.
</p>
<p>
Topics are <strong>partitioned</strong>, meaning a topic is spread over a number of "buckets" located on different Kafka brokers. This distributed placement of your data is very important for scalability because it allows client applications to both read and write the data from/to many brokers at the same time. When a new event is published to a topic, it is actually appended to one of the topic's partitions. Events with the same event key (e.g., a customer or vehicle ID) are written to the same partition, and Kafka <a href="/documentation/#semantics">guarantees</a> that any consumer of a given topic-partition will always read that partition's events in exactly the same order as they were written.
</p>
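That ordering guarantee is easy to see in producer code. A minimal sketch, assuming a local broker and a String-keyed topic (the broker address, topic, key, and values below are placeholders): both records carry the same key, so they land in the same partition and will be read back in exactly this order.

import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import java.util.Properties;

public class KeyedProducerExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092");  // placeholder broker address
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Same key ("vehicle-42") => same partition => events are read back in write order.
            producer.send(new ProducerRecord<>("payments", "vehicle-42", "trip-started"));
            producer.send(new ProducerRecord<>("payments", "vehicle-42", "trip-ended"));
        }
    }
}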
<figure class="figure">
<img src="/images/streams-and-tables-p1_p4.png" class="figure-image" />
<figcaption class="figure-caption">
- Figure: This example topic has four partitions P1–P4. Two different producer clients are publishing,
+ Figure: This example topic has four partitions P1-P4. Two different producer clients are publishing,
independently from each other, new events to the topic by writing events over the network to the topic's
partitions. Events with the same key (denoted by their color in the figure) are written to the same
partition. Note that both producers can write to the same partition if appropriate.
8 changes: 4 additions & 4 deletions docs/ops.html
@@ -904,7 +904,7 @@ <h5 class="anchor-heading"><a id="georeplication-flow-secure" class="anchor-link
<h5 class="anchor-heading"><a id="georeplication-topic-naming" class="anchor-link"></a><a href="#georeplication-topic-naming">Custom Naming of Replicated Topics in Target Clusters</a></h5>

<p>
- Replicated topics in a target cluster—sometimes called <em>remote</em> topics—are renamed according to a replication policy. MirrorMaker uses this policy to ensure that events (aka records, messages) from different clusters are not written to the same topic-partition. By default as per <a href="https://github.com/apache/kafka/blob/trunk/connect/mirror-client/src/main/java/org/apache/kafka/connect/mirror/DefaultReplicationPolicy.java">DefaultReplicationPolicy</a>, the names of replicated topics in the target clusters have the format <code>{source}.{source_topic_name}</code>:
+ Replicated topics in a target cluster-sometimes called <em>remote</em> topics-are renamed according to a replication policy. MirrorMaker uses this policy to ensure that events (aka records, messages) from different clusters are not written to the same topic-partition. By default as per <a href="https://github.com/apache/kafka/blob/trunk/connect/mirror-client/src/main/java/org/apache/kafka/connect/mirror/DefaultReplicationPolicy.java">DefaultReplicationPolicy</a>, the names of replicated topics in the target clusters have the format <code>{source}.{source_topic_name}</code>:
</p>

<pre><code class="language-text">us-west us-east
@@ -1260,7 +1260,7 @@ <h4 class="anchor-heading"><a id="multitenancy-security" class="anchor-link"></a
</p>

<p>
- In the following example, user Alice—a new member of ACME corporation's InfoSec team—is granted write permissions to all topics whose names start with "acme.infosec.", such as "acme.infosec.telemetry.logins" and "acme.infosec.syslogs.events".
+ In the following example, user Alice-a new member of ACME corporation's InfoSec team-is granted write permissions to all topics whose names start with "acme.infosec.", such as "acme.infosec.telemetry.logins" and "acme.infosec.syslogs.events".
</p>
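The bash snippet below shows the documented CLI route; roughly the same prefixed grant can also be expressed through the Admin client. A hedged sketch (connection setup and error handling omitted; the class and method names are invented):

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.common.acl.AccessControlEntry;
import org.apache.kafka.common.acl.AclBinding;
import org.apache.kafka.common.acl.AclOperation;
import org.apache.kafka.common.acl.AclPermissionType;
import org.apache.kafka.common.resource.PatternType;
import org.apache.kafka.common.resource.ResourcePattern;
import org.apache.kafka.common.resource.ResourceType;
import java.util.List;

final class AclSketch {
    // ALLOW User:Alice to WRITE to every topic whose name starts with "acme.infosec."
    static void grantAliceWrite(Admin admin) throws Exception {
        AclBinding grant = new AclBinding(
                new ResourcePattern(ResourceType.TOPIC, "acme.infosec.", PatternType.PREFIXED),
                new AccessControlEntry("User:Alice", "*", AclOperation.WRITE, AclPermissionType.ALLOW));
        admin.createAcls(List.of(grant)).all().get();
    }
}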

<pre><code class="language-bash"># Grant permissions to user Alice
@@ -1277,11 +1277,11 @@ <h4 class="anchor-heading"><a id="multitenancy-isolation" class="anchor-link"></a
<h4 class="anchor-heading"><a id="multitenancy-isolation" class="anchor-link"></a><a href="#multitenancy-isolation">Isolating Tenants: Quotas, Rate Limiting, Throttling</a></h4>

<p>
- Multi-tenant clusters should generally be configured with <a href="#design_quotas">quotas</a>, which protect against users (tenants) eating up too many cluster resources, such as when they attempt to write or read very high volumes of data, or create requests to brokers at an excessively high rate. This may cause network saturation, monopolize broker resources, and impact other clients—all of which you want to avoid in a shared environment.
+ Multi-tenant clusters should generally be configured with <a href="#design_quotas">quotas</a>, which protect against users (tenants) eating up too many cluster resources, such as when they attempt to write or read very high volumes of data, or create requests to brokers at an excessively high rate. This may cause network saturation, monopolize broker resources, and impact other clients-all of which you want to avoid in a shared environment.
</p>

<p>
- <strong>Client quotas:</strong> Kafka supports different types of (per-user principal) client quotas. Because a client's quotas apply irrespective of which topics the client is writing to or reading from, they are a convenient and effective tool to allocate resources in a multi-tenant cluster. <a href="#design_quotascpu">Request rate quotas</a>, for example, help to limit a user's impact on broker CPU usage by limiting the time a broker spends on the <a href="/protocol.html">request handling path</a> for that user, after which throttling kicks in. In many situations, isolating users with request rate quotas has a bigger impact in multi-tenant clusters than setting incoming/outgoing network bandwidth quotas, because excessive broker CPU usage for processing requests reduces the effective bandwidth the broker can serve. Furthermore, administrators can also define quotas on topic operations—such as create, delete, and alter—to prevent Kafka clusters from being overwhelmed by highly concurrent topic operations (see <a href="https://cwiki.apache.org/confluence/x/6DLcC">KIP-599</a> and the quota type <code>controller_mutation_rate</code>).
+ <strong>Client quotas:</strong> Kafka supports different types of (per-user principal) client quotas. Because a client's quotas apply irrespective of which topics the client is writing to or reading from, they are a convenient and effective tool to allocate resources in a multi-tenant cluster. <a href="#design_quotascpu">Request rate quotas</a>, for example, help to limit a user's impact on broker CPU usage by limiting the time a broker spends on the <a href="/protocol.html">request handling path</a> for that user, after which throttling kicks in. In many situations, isolating users with request rate quotas has a bigger impact in multi-tenant clusters than setting incoming/outgoing network bandwidth quotas, because excessive broker CPU usage for processing requests reduces the effective bandwidth the broker can serve. Furthermore, administrators can also define quotas on topic operations-such as create, delete, and alter-to prevent Kafka clusters from being overwhelmed by highly concurrent topic operations (see <a href="https://cwiki.apache.org/confluence/x/6DLcC">KIP-599</a> and the quota type <code>controller_mutation_rate</code>).
</p>
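As a sketch of how such a request-rate quota might be applied programmatically via the Admin client's quota API from KIP-546 (the tenant name and the 20% value are assumptions; the kafka-configs.sh tool is the more common route):

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.common.quota.ClientQuotaAlteration;
import org.apache.kafka.common.quota.ClientQuotaEntity;
import java.util.List;
import java.util.Map;

final class QuotaSketch {
    // Cap the broker time spent on this user's requests at 20% of one request
    // handler thread; "request_percentage" is the request rate quota key.
    static void throttleTenant(Admin admin) throws Exception {
        ClientQuotaEntity tenant =
                new ClientQuotaEntity(Map.of(ClientQuotaEntity.USER, "tenant-a"));
        admin.alterClientQuotas(List.of(new ClientQuotaAlteration(tenant,
                List.of(new ClientQuotaAlteration.Op("request_percentage", 20.0))))).all().get();
    }
}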

<p>
4 changes: 2 additions & 2 deletions docs/quickstart.html
@@ -124,7 +124,7 @@ <h4 class="anchor-heading">
<p>
A Kafka client communicates with the Kafka brokers via the network for writing (or reading) events.
Once received, the brokers will store the events in a durable and fault-tolerant manner for as long as you
- need—even forever.
+ need-even forever.
</p>

<p>
@@ -293,7 +293,7 @@ <h4 class="anchor-heading">
</h4>

<p>
- Now that you reached the end of the quickstart, feel free to tear down the Kafka environment—or
+ Now that you reached the end of the quickstart, feel free to tear down the Kafka environment-or
continue playing around.
</p>

@@ -93,7 +93,7 @@ public static <K, V> Repartitioned<K, V> with(final Serde<K> keySerde,
*
* @param partitioner the function used to determine how records are distributed among partitions of the topic,
* if not specified and the key serde provides a {@link WindowedSerializer} for the key
- * {@link WindowedStreamPartitioner} will be used—otherwise {@link DefaultStreamPartitioner} will be used
+ * {@link WindowedStreamPartitioner} will be used-otherwise {@link DefaultStreamPartitioner} will be used
* @param <K> key type
* @param <V> value type
* @return A new {@code Repartitioned} instance configured with partitioner
@@ -162,7 +162,7 @@ public Repartitioned<K, V> withValueSerde(final Serde<V> valueSerde) {
*
* @param partitioner the function used to determine how records are distributed among partitions of the topic,
* if not specified and the key serde provides a {@link WindowedSerializer} for the key
- * {@link WindowedStreamPartitioner} will be used—otherwise {@link DefaultStreamPartitioner} will be used
+ * {@link WindowedStreamPartitioner} will be used-otherwise {@link DefaultStreamPartitioner} will be used
* @return a new {@code Repartitioned} instance configured with provided partitioner
*/
public Repartitioned<K, V> withStreamPartitioner(final StreamPartitioner<K, V> partitioner) {
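A hedged usage sketch for the method documented above (the topic, the incoming stream, and the region-prefix keying rule are invented for illustration):

import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.kstream.KStream;
import org.apache.kafka.streams.kstream.Repartitioned;
import org.apache.kafka.streams.processor.StreamPartitioner;

final class RepartitionSketch {
    // Route every record whose key shares a region prefix (e.g. "eu-...") to one partition.
    static KStream<String, String> repartitionByRegion(KStream<String, String> stream) {
        StreamPartitioner<String, String> byRegion = (topic, key, value, numPartitions) ->
                Math.floorMod(key.split("-")[0].hashCode(), numPartitions);
        return stream.repartition(
                Repartitioned.<String, String>with(Serdes.String(), Serdes.String())
                        .withStreamPartitioner(byRegion));
    }
}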