search
HomeJavajavaTutorialBelay the Metamorphosis: analyzing Kafka project

您有没有想过跨国公司的项目源代码中可能潜藏着哪些错误?不要错过在开源 Apache Kafka 项目中发现 PVS-Studio 静态分析器检测到的有趣错误的机会。

Belay the Metamorphosis: analyzing Kafka project

介绍

Apache Kafka 是一个著名的开源项目,主要用 Java 编写。 LinkedIn 于 2011 年将其开发为消息代理,即各种系统组件的数据管道。如今,它已成为同类产品中最受欢迎的解决方案之一。

准备好看看引擎盖下的内容了吗?

附注
只是想简单说明一下标题。它参考了弗朗茨·卡夫卡的《变形记》,其中主角变成了可怕的害虫。我们的静态分析器致力于防止您的项目变身为可怕的害虫转变为一个巨大的错误,所以对“变形记”说不。

哦不,虫子

所有的幽默都源于痛苦

这不是我的话;这句话出自理查德·普赖尔之口。但这有什么关系呢?我想告诉你的第一件事是一个愚蠢的错误。然而,在多次尝试理解程序无法正常运行的原因后,遇到如下示例的情况令人沮丧:

@Override
public KeyValueIterator<windowed>, V> backwardFetch(
  K keyFrom,
  K keyTo,
  Instant timeFrom,
  Instant timeTo) {
  ....
  if (keyFrom == null && keyFrom == null) {   // 



<p>如您所见,这是任何开发人员都无法避免的事情——一个微不足道的拼写错误。在第一个条件下,开发人员希望使用以下逻辑表达式:<br>
</p>

<pre class="brush:php;toolbar:false">keyFrom == null && keyTo == null

分析器发出两个警告:

V6001 在“&&”运算符的左侧和右侧有相同的子表达式“keyFrom == null”。 ReadOnlyWindowStoreStub.java 327、ReadOnlyWindowStoreStub.java 327

V6007 表达式“keyFrom == null”始终为 false。 ReadOnlyWindowStoreStub.java 329

我们可以明白为什么。对于每个开发人员来说,这种可笑的打字错误都是永恒的。虽然我们可以花很多时间寻找它们,但要回忆起它们潜伏的地方可不是小菜一碟。

在同一个类中,另一个方法中存在完全相同的错误。我认为称其为复制面食是公平的。

@Override
public KeyValueIterator<windowed>, V> fetch(
  K keyFrom,
  K keyTo,
  Instant timeFrom,
  Instant timeTo) {
  ....
  NavigableMap<k v> kvMap = data.get(now);
  if (kvMap != null) {
    NavigableMap<k v> kvSubMap;
    if (keyFrom == null && keyFrom == null) {      // 



<p>以下是相同的警告:</p>

<p>V6007 表达式“keyFrom == null”始终为 false。 ReadOnlyWindowStoreStub.java 273 </p>

<p>V6001 在“&&”运算符的左侧和右侧有相同的子表达式“keyFrom == null”。 ReadOnlyWindowStoreStub.java 271, ReadOnlyWindowStoreStub.java 271</p>

<p>不用担心——我们不必一次查看数百行代码。 PVS-Studio 非常擅长处理此类简单的事情。解决一些更具挑战性的事情怎么样?</p>

<h3>
  
  
  可变同步
</h3>

<p>Java 中 <em>synchronized</em> 关键字的用途是什么?在这里,我将只关注同步方法,而不是块。根据 Oracle 文档,<em>synchronized</em> 关键字将方法声明为同步,以确保与实例的线程安全交互。如果一个线程调用该实例的同步方法,则尝试调用同一实例的同步方法的其他线程将被阻塞(即它们的执行将被挂起)。它们将被阻塞,直到第一个线程调用的方法处理其执行。当实例对多个线程可见时,需要执行此操作。此类实例的读/写操作只能通过同步方法执行。 </p>

<p>开发人员违反了 <em>Sensor</em> 类中的规则,如下面的简化代码片段所示。对实例字段的读/写操作可以通过同步和非同步两种方式执行。它可能会导致竞争条件并使输出不可预测。<br>
</p>

<pre class="brush:php;toolbar:false">private final Map<metricname kafkametric> metrics;

public void checkQuotas(long timeMs) {                  // 



<p>分析器警告如下所示:</p>

<p>V6102 “metrics”字段同步不一致。考虑在所有用途上同步该字段。传感器.java 49,传感器.java 254</p>

<p>如果不同的线程可以同时更改实例状态,则允许此操作的方法应该同步。如果程序没有预料到多个线程可以与实例交互,则使其方法同步是没有意义的。最坏的情况下,甚至会损害程序性能。</p>

<p>程序中有很多这样的错误。这是分析器发出警告的类似代码片段:<br>
</p>

<pre class="brush:php;toolbar:false">private final PrefixKeyFormatter prefixKeyFormatter; 

@Override
public synchronized void destroy() {                // (
    prefixKeyFormatter.addPrefix(record.key),
    record.value
    ), batch
  );
} 

@Override
public synchronized void deleteRange(....) {        // 



<p>分析器警告:</p>

<p>V6102 “prefixKeyFormatter”字段同步不一致。考虑在所有用途上同步该字段。 LogicalKeyValueSegment.java 60、LogicalKeyValueSegment.java 247</p>

<h3>
  
  
  Iterator, iterator, and iterator again...
</h3>

<p>In the example, there are two rather unpleasant errors within one line at once. I'll explain their nature within the part of the article. Here's a code snippet:<br>
</p>

<pre class="brush:php;toolbar:false">private final Map<string uuid> topicIds = new HashMap(); 

private Map<string kafkafuturevoid> handleDeleteTopicsUsingNames(....) { 
  ....
  Collection<string> topicNames = new ArrayList(topicNameCollection);

  for (final String topicName : topicNames) {
    KafkaFutureImpl<void> future = new KafkaFutureImpl();

    if (allTopics.remove(topicName) == null) {
      ....
    } else {
      topicNames.remove(topicIds.remove(topicName));      // 



<p>That's what the analyzer shows us:</p>

<p>V6066 The type of object passed as argument is incompatible with the type of collection: String, Uuid. MockAdminClient.java 569</p>

<p>V6053 The 'topicNames' collection of 'ArrayList' type is modified while iteration is in progress. ConcurrentModificationException may occur. MockAdminClient.java 569</p>

<p>Now that's a big dilemma! What's going on here, and how should we address it?! </p>

<p>First, let's talk about collections and generics. Using the generic types of collections helps us avoid <em>ClassCastExceptions</em> and cumbersome constructs where we convert types. </p>

<p>If we specify a certain data type when initializing a collection and add an incompatible type, the compiler won't compile the code. </p>

<p>Here's an example:<br>
</p>

<pre class="brush:php;toolbar:false">public class Test {
  public static void main(String[] args) {
    Set<string> set = new HashSet();
    set.add("str");
    set.add(UUID.randomUUID()); // java.util.UUID cannot be converted to
                                // java.lang.String
  }
}
</string>

However, if we delete an incompatible type from our Set, no exception will be thrown. The method returns false.

Here's an example:

public class Test {
  public static void main(String[] args) {
    Set<string> set = new HashSet();
    set.add("abc");
    set.add("def");
    System.out.println(set.remove(new Integer(13))); // false
  }
}
</string>

It's a waste of time. Most likely, if we encounter something like this in the code, this is an error. I suggest you go back to the code at the beginning of this subchapter and try to spot a similar case.

Second, let's talk about the Iterator. We can talk about iterating through collections for a long time. I don't want to bore you or digress from the main topic, so I'll just cover the key points to ensure we understand why we get the warning.

So, how do we iterate through the collection here? Here is what the for loop in the code fragment looks like:

for (Type collectionElem : collection) {
  ....
}

The for loop entry is just syntactic sugar. The construction is equivalent to this one:

for (Iterator<type> iter = collection.iterator(); iter.hasNext();) {
  Type collectionElem = iter.next();
  ....
}
</type>

We're basically working with the collection iterator. All right, that's sorted! Now, let's discuss ConcurrentModificationException.

ConcurrentModificationException is an exception that covers a range of situations both in single-threaded and multi-threaded programs. Here, we're focusing on single-threading. We can find an explanation quite easily. Let's take a peek at the Oracle docs: a method can throw the exception when it detects parallel modification of an object that doesn't support it. In our case, while the iterator is running, we delete objects from the collection. This may cause the iterator to throw a ConcurrentModificationException.

How does the iterator know when to throw the exception? If we look at the ArrayList collection, we see that its parent, AbstactList, has the modCount field that stores the number of modifications to the collection:

protected transient int modCount = 0;

Here are some usages of the modCount counter in the ArrayList class:

public boolean add(E e) {
  modCount++;
  add(e, elementData, size);
  return true;
}

private void fastRemove(Object[] es, int i) {
  modCount++;
  final int newSize;
  if ((newSize = size - 1) > i)
    System.arraycopy(es, i + 1, es, i, newSize - i);
  es[size = newSize] = null;
}

So, the counter is incremented each time when the collection is modified.

Btw, the fastRemove method is used in the remove method, which we use inside the loop.

Here's the small code fragment of the ArrayList iterator inner workings:

private class Itr implements Iterator<e> {
  ....
  int expectedModCount = modCount;            

  final void checkForComodification() {
  if (modCount != expectedModCount)               // 



<p>Let me explain that last fragment. If the collection modifications don't match the expected number of modifications (which is the sum of the initial modifications before the iterator was created and the number of the iterator operations), a <em>ConcurrentModificationException</em> is thrown. That's only possible when we modify the collection using its methods while iterating over it (i.e. <strong>in parallel</strong> with the iterator). That's what the second warning is about.</p>

<p>So, I've explained you the analyzer messages. Now let's put it all together: </p>

<p>We attempt to delete an element from the collection when the <em>Iterator</em> is still running:<br>
</p>

<pre class="brush:php;toolbar:false">topicNames.remove(topicIds.remove(topicName)); 
// topicsNames – Collection<string>
// topicsIds – Map<string uuid>
</string></string>

However, since the incompatible element is passed to ArrayList for deletion (the remove method returns a UUID object from topicIds), the modification count won't increase, but the object won't be deleted. Simply put, that code section is rudimentary.

I'd venture to guess that the developer's intent is clear. If that's the case, one way to fix these two warnings could be as follows:

Collection<string> topicNames = new ArrayList(topicNameCollection);

List<string> removableItems = new ArrayList();

for (final String topicName : topicNames) {
  KafkaFutureImpl<void> future = new KafkaFutureImpl();

  if (allTopics.remove(topicName) == null) {
    ....
  } else {
    topicIds.remove(topicName);
    removableItems.add(topicName);
    future.complete(null);
  }
  ....
}
topicNames.removeAll(removableItems);
</void></string></string>

Void, sweet void

Where would we go without our all-time favorite null and its potential problems, right? Let me show you the code fragment for which the analyzer issued the following warning:

V6008 Potential null dereference of 'oldMember' in function 'removeStaticMember'. ConsumerGroup.java 311, ConsumerGroup.java 323

@Override
public void removeMember(String memberId) {
  ConsumerGroupMember oldMember = members.remove(memberId);
  ....
  removeStaticMember(oldMember);
  ....
}

private void removeStaticMember(ConsumerGroupMember oldMember) {
  if (oldMember.instanceId() != null) {
    staticMembers.remove(oldMember.instanceId());
  }
}

If members doesn't contain an object with the memberId key, oldMember will be null. It can lead to a NullPointerException in the removeStaticMember method.

Boom! The parameter is checked for null:

if (oldMember != null && oldMember.instanceId() != null) {

The next error will be the last one in the article—I'd like to wrap things up on a positive note. The code below—as well as the one at the beginning of this article—has a common and silly typo. However, it can certainly lead to unpleasant consequences.

Let's take a look at this code fragment:

protected SchemaAndValue roundTrip(...., SchemaAndValue input) {
  String serialized = Values.convertToString(input.schema(),
                                             input.value());

  if (input != null && input.value() != null) {   
    ....
  }
  ....
}

Yeah, that's right. The method actually accesses the input object first, and then checks whether it's referencing null.

V6060 The 'input' reference was utilized before it was verified against null. ValuesTest.java 1212, ValuesTest.java 1213

Again, I'll note that such typos are ok. However, they can lead to some pretty nasty results. It's tough and inefficient to search for these things in the code manually.

Conclusion

In sum, I'd like to circle back to the previous point. Manually searching through the code for all these errors is a very time-consuming and tedious task. It's not unusual for issues like the ones I've shown to lurk in code for a long time. The last bug dates back to 2018. That's why it's a good idea to use static analysis tools. If you'd like to know more about PVS-Studio, the tool we have used to detect all those errors, you can find out more here.

That's all. Let's wrap things up here. "Oh, and in case I don't see ya, good afternoon, good evening, and good night."

Belay the Metamorphosis: analyzing Kafka project

I almost forgot! Catch a link to learn more about a free license for open-source projects.

The above is the detailed content of Belay the Metamorphosis: analyzing Kafka project. For more information, please follow other related articles on the PHP Chinese website!

Statement
The content of this article is voluntarily contributed by netizens, and the copyright belongs to the original author. This site does not assume corresponding legal responsibility. If you find any content suspected of plagiarism or infringement, please contact admin@php.cn
Top 4 JavaScript Frameworks in 2025: React, Angular, Vue, SvelteTop 4 JavaScript Frameworks in 2025: React, Angular, Vue, SvelteMar 07, 2025 pm 06:09 PM

This article analyzes the top four JavaScript frameworks (React, Angular, Vue, Svelte) in 2025, comparing their performance, scalability, and future prospects. While all remain dominant due to strong communities and ecosystems, their relative popul

Spring Boot SnakeYAML 2.0 CVE-2022-1471 Issue FixedSpring Boot SnakeYAML 2.0 CVE-2022-1471 Issue FixedMar 07, 2025 pm 05:52 PM

This article addresses the CVE-2022-1471 vulnerability in SnakeYAML, a critical flaw allowing remote code execution. It details how upgrading Spring Boot applications to SnakeYAML 1.33 or later mitigates this risk, emphasizing that dependency updat

How do I implement multi-level caching in Java applications using libraries like Caffeine or Guava Cache?How do I implement multi-level caching in Java applications using libraries like Caffeine or Guava Cache?Mar 17, 2025 pm 05:44 PM

The article discusses implementing multi-level caching in Java using Caffeine and Guava Cache to enhance application performance. It covers setup, integration, and performance benefits, along with configuration and eviction policy management best pra

Node.js 20: Key Performance Boosts and New FeaturesNode.js 20: Key Performance Boosts and New FeaturesMar 07, 2025 pm 06:12 PM

Node.js 20 significantly enhances performance via V8 engine improvements, notably faster garbage collection and I/O. New features include better WebAssembly support and refined debugging tools, boosting developer productivity and application speed.

How does Java's classloading mechanism work, including different classloaders and their delegation models?How does Java's classloading mechanism work, including different classloaders and their delegation models?Mar 17, 2025 pm 05:35 PM

Java's classloading involves loading, linking, and initializing classes using a hierarchical system with Bootstrap, Extension, and Application classloaders. The parent delegation model ensures core classes are loaded first, affecting custom class loa

Iceberg: The Future of Data Lake TablesIceberg: The Future of Data Lake TablesMar 07, 2025 pm 06:31 PM

Iceberg, an open table format for large analytical datasets, improves data lake performance and scalability. It addresses limitations of Parquet/ORC through internal metadata management, enabling efficient schema evolution, time travel, concurrent w

How to Share Data Between Steps in CucumberHow to Share Data Between Steps in CucumberMar 07, 2025 pm 05:55 PM

This article explores methods for sharing data between Cucumber steps, comparing scenario context, global variables, argument passing, and data structures. It emphasizes best practices for maintainability, including concise context use, descriptive

How can I implement functional programming techniques in Java?How can I implement functional programming techniques in Java?Mar 11, 2025 pm 05:51 PM

This article explores integrating functional programming into Java using lambda expressions, Streams API, method references, and Optional. It highlights benefits like improved code readability and maintainability through conciseness and immutability

See all articles

Hot AI Tools

Undresser.AI Undress

Undresser.AI Undress

AI-powered app for creating realistic nude photos

AI Clothes Remover

AI Clothes Remover

Online AI tool for removing clothes from photos.

Undress AI Tool

Undress AI Tool

Undress images for free

Clothoff.io

Clothoff.io

AI clothes remover

AI Hentai Generator

AI Hentai Generator

Generate AI Hentai for free.

Hot Article

Hot Tools

EditPlus Chinese cracked version

EditPlus Chinese cracked version

Small size, syntax highlighting, does not support code prompt function

SublimeText3 Linux new version

SublimeText3 Linux new version

SublimeText3 Linux latest version

ZendStudio 13.5.1 Mac

ZendStudio 13.5.1 Mac

Powerful PHP integrated development environment

Notepad++7.3.1

Notepad++7.3.1

Easy-to-use and free code editor

SublimeText3 English version

SublimeText3 English version

Recommended: Win version, supports code prompts!