Belay the Metamorphosis: analyzing Kafka project


Patricia Arquette

Oct 16, 2024, 08:09 PM

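Have you ever wondered what bugs may lurk in the source code of projects built by international companies? Don't miss the chance to see the interesting bugs that the PVS-Studio static analyzer detected in the open-source Apache Kafka project.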


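Introduction

Apache Kafka is a well-known open-source project written mostly in Java. LinkedIn developed it in 2011 as a message broker, i.e. a data pipeline for various system components. Today, it's one of the most popular solutions in its category.

Ready to take a look under the hood?

P.S.

Just a quick note on the title. It references Franz Kafka's "The Metamorphosis", in which the main character turns into a monstrous vermin. Our static analyzer fights to keep your projects from ~~turning into a monstrous vermin~~ transforming into one colossal bug, so say no to "The Metamorphosis".

Oh no, bugs

All humor is rooted in pain

These aren't my words; the quote belongs to Richard Pryor. But what does that have to do with anything? The first thing I'd like to show you is a silly error. Yet, after many attempts to understand why a program doesn't work properly, it's frustrating to run into something like the example below: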

@Override
public KeyValueIterator<Windowed<K>, V> backwardFetch(
  K keyFrom,
  K keyTo,
  Instant timeFrom,
  Instant timeTo) {
  ....
  if (keyFrom == null && keyFrom == null) {   // <=
    ....
  }
  ....
}

As you can see, this is something no developer can avoid: a trivial typo. In the very first condition, the developer meant to use the following logical expression:

keyFrom == null && keyTo == null

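The analyzer issued two warnings:

V6001 There are identical sub-expressions 'keyFrom == null' to the left and to the right of the '&&' operator. ReadOnlyWindowStoreStub.java 327, ReadOnlyWindowStoreStub.java 327

V6007 Expression 'keyFrom == null' is always false. ReadOnlyWindowStoreStub.java 329

We can see why. Such laughable typos are timeless for every developer. We can spend a lot of time hunting for them, yet recalling where they lurk is no piece of cake.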
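To make the effect of this typo concrete, here is a minimal sketch. The listing in the article is cut off right after the condition, so the class name `KeyRangeDemo`, the method names, and the branch bodies (`headMap`, `tailMap`, `subMap`) are assumptions chosen for illustration and are not the Kafka code. The sketch shows why the second warning follows from the first: once `keyFrom == null` has been handled by the first branch, the same test in the next branch can never be true.

```java
import java.util.NavigableMap;
import java.util.TreeMap;

class KeyRangeDemo {

    // Buggy variant: the first condition tests keyFrom twice.
    static <K, V> NavigableMap<K, V> selectBuggy(NavigableMap<K, V> map, K keyFrom, K keyTo) {
        if (keyFrom == null && keyFrom == null) {   // typo: keyTo is never checked
            return map;
        } else if (keyFrom == null) {               // always false at this point
            return map.headMap(keyTo, true);
        } else if (keyTo == null) {
            return map.tailMap(keyFrom, true);
        }
        return map.subMap(keyFrom, true, keyTo, true);
    }

    // Fixed variant: the first condition tests both bounds.
    static <K, V> NavigableMap<K, V> selectFixed(NavigableMap<K, V> map, K keyFrom, K keyTo) {
        if (keyFrom == null && keyTo == null) {
            return map;
        } else if (keyFrom == null) {
            return map.headMap(keyTo, true);
        } else if (keyTo == null) {
            return map.tailMap(keyFrom, true);
        }
        return map.subMap(keyFrom, true, keyTo, true);
    }

    public static void main(String[] args) {
        NavigableMap<String, Integer> map = new TreeMap<>();
        map.put("a", 1);
        map.put("b", 2);
        map.put("c", 3);
        // Only an upper bound is given: the buggy variant ignores it.
        System.out.println(selectBuggy(map, null, "b").size()); // 3
        System.out.println(selectFixed(map, null, "b").size()); // 2
    }
}
```

Under these assumptions, the bug surfaces only when `keyFrom` is null and `keyTo` is not, which is exactly the kind of case that is easy to miss in manual testing.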

In the same class, another method contains exactly the same error. I think it's fair to call it copypasta.

@Override
public KeyValueIterator<Windowed<K>, V> fetch(
  K keyFrom,
  K keyTo,
  Instant timeFrom,
  Instant timeTo) {
  ....
  NavigableMap<K, V> kvMap = data.get(now);
  if (kvMap != null) {
    NavigableMap<K, V> kvSubMap;
    if (keyFrom == null && keyFrom == null) {      // <=
      ....
    }
    ....
  }
  ....
}

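Here are the same warnings:

V6007 Expression 'keyFrom == null' is always false. ReadOnlyWindowStoreStub.java 273

V6001 There are identical sub-expressions 'keyFrom == null' to the left and to the right of the '&&' operator. ReadOnlyWindowStoreStub.java 271, ReadOnlyWindowStoreStub.java 271

No worries: we don't have to look through hundreds of lines of code at once. PVS-Studio is great at handling such simple things. How about tackling something more challenging?

Mutable synchronization

What's the purpose of the synchronized keyword in Java? Here, I'll only focus on synchronized methods, not blocks. According to the Oracle docs, the synchronized keyword declares a method as synchronized to ensure thread-safe interaction with an instance. If a thread invokes a synchronized method of an instance, other threads that try to invoke synchronized methods of the same instance are blocked (i.e. their execution is suspended) until the method invoked by the first thread finishes executing. This is needed when the instance is visible to more than one thread. Read/write operations on such instances should be performed only via synchronized methods.

The developer broke this rule in the Sensor class, as shown in the simplified code fragment below. Read/write operations on the instance field are performed through both synchronized and non-synchronized methods. This can lead to a race condition and make the output unpredictable.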

private final Map<MetricName, KafkaMetric> metrics;

public void checkQuotas(long timeMs) {                  // <=
  ....
}
....

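The analyzer warning looks like this:

V6102 Inconsistent synchronization of the 'metrics' field. Consider synchronizing the field on all usages. Sensor.java 49, Sensor.java 254

If different threads can change the instance state at the same time, the methods that allow this should be synchronized. If the program doesn't expect multiple threads to interact with the instance, making its methods synchronized is pointless. At worst, it can even hurt program performance.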
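As an illustration of what consistent synchronization looks like, here is a minimal sketch. `HitCounter` and all of its members are hypothetical and unrelated to Kafka's Sensor class. The point is only that every method that reads or writes the shared map is synchronized on the same instance.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical class, not taken from Kafka: every access to 'hits' is synchronized.
class HitCounter {
    private final Map<String, Integer> hits = new HashMap<>();

    public synchronized void record(String key) {
        hits.merge(key, 1, Integer::sum);
    }

    public synchronized int total() {
        int sum = 0;
        for (int value : hits.values()) {
            sum += value;
        }
        return sum;
    }

    // Starts 'threads' workers that each call record() 'perThread' times.
    static int runWorkers(int threads, int perThread) {
        HitCounter counter = new HitCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            final String key = "key-" + (i % 2);
            workers[i] = new Thread(() -> {
                for (int n = 0; n < perThread; n++) {
                    counter.record(key);
                }
            });
            workers[i].start();
        }
        try {
            for (Thread worker : workers) {
                worker.join();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter.total();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 10_000)); // 40000
    }
}
```

If `record` lost its synchronized modifier, concurrent `merge` calls on the plain HashMap could lose updates, and the printed total would no longer be guaranteed. That is the kind of inconsistency V6102 points at.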

There are plenty of such errors in the program. Here's a similar code fragment for which the analyzer issued a warning:

private final PrefixKeyFormatter prefixKeyFormatter;

@Override
public synchronized void destroy() {                // <=
  ....
}

....
    prefixKeyFormatter.addPrefix(record.key),
    record.value
    ), batch
  );
}

@Override
public synchronized void deleteRange(....) {        // <=
  ....
}

The analyzer warning:

V6102 Inconsistent synchronization of the 'prefixKeyFormatter' field. Consider synchronizing the field on all usages. LogicalKeyValueSegment.java 60, LogicalKeyValueSegment.java 247

Iterator, iterator, and iterator again...

The example below contains two rather unpleasant errors in a single line. I'll explain their nature in this part of the article. Here's the code snippet:

private final Map<String, Uuid> topicIds = new HashMap<>();

private Map<String, KafkaFuture<Void>> handleDeleteTopicsUsingNames(....) {
  ....
  Collection<String> topicNames = new ArrayList<>(topicNameCollection);

  for (final String topicName : topicNames) {
    KafkaFutureImpl<Void> future = new KafkaFutureImpl<>();

    if (allTopics.remove(topicName) == null) {
      ....
    } else {
      topicNames.remove(topicIds.remove(topicName));      // <=
      ....
    }
    ....
  }
  ....
}

That's what the analyzer shows us:

V6066 The type of object passed as argument is incompatible with the type of collection: String, Uuid. MockAdminClient.java 569

V6053 The 'topicNames' collection of 'ArrayList' type is modified while iteration is in progress. ConcurrentModificationException may occur. MockAdminClient.java 569

Now that's a big dilemma! What's going on here, and how should we address it?!

First, let's talk about collections and generics. Using generic collection types helps us avoid ClassCastExceptions and cumbersome type-casting constructs.

If we specify an element type when declaring a collection and then try to add an object of an incompatible type, the compiler rejects the code.

Here's an example:

import java.util.HashSet;
import java.util.Set;
import java.util.UUID;

public class Test {
  public static void main(String[] args) {
    Set<String> set = new HashSet<>();
    set.add("str");
    set.add(UUID.randomUUID()); // compile error: java.util.UUID cannot be
                                // converted to java.lang.String
  }
}

However, if we try to remove an object of an incompatible type from our Set, no exception is thrown. The method simply returns false.

Here's an example:

import java.util.HashSet;
import java.util.Set;

public class Test {
  public static void main(String[] args) {
    Set<String> set = new HashSet<>();
    set.add("abc");
    set.add("def");
    System.out.println(set.remove(Integer.valueOf(13))); // false
  }
}

Such a call is a waste of time. Most likely, if we encounter something like this in the code, it's an error. I suggest you go back to the code at the beginning of this subchapter and try to spot a similar case.

Second, let's talk about the Iterator. We can talk about iterating through collections for a long time. I don't want to bore you or digress from the main topic, so I'll just cover the key points to ensure we understand why we get the warning.

So, how do we iterate through the collection here? Here is what the for loop in the code fragment looks like:

for (Type collectionElem : collection) {
  ....
}

This form of the for loop is just syntactic sugar. The construction is equivalent to this one:

for (Iterator<Type> iter = collection.iterator(); iter.hasNext();) {
  Type collectionElem = iter.next();
  ....
}

We're basically working with the collection iterator. All right, that's sorted! Now, let's discuss ConcurrentModificationException.

ConcurrentModificationException is an exception that covers a range of situations in both single-threaded and multi-threaded programs. Here, we're focusing on the single-threaded case. An explanation is easy to find. Let's take a peek at the Oracle docs: a method can throw the exception when it detects concurrent modification of an object that doesn't permit it. In our case, we delete objects from the collection while the iterator is running. This may cause the iterator to throw a ConcurrentModificationException.

How does the iterator know when to throw the exception? If we look at the ArrayList collection, we see that its parent, AbstractList, has the modCount field that stores the number of modifications to the collection:

protected transient int modCount = 0;

Here are some usages of the modCount counter in the ArrayList class:

public boolean add(E e) {
  modCount++;
  add(e, elementData, size);
  return true;
}

private void fastRemove(Object[] es, int i) {
  modCount++;
  final int newSize;
  if ((newSize = size - 1) > i)
    System.arraycopy(es, i + 1, es, i, newSize - i);
  es[size = newSize] = null;
}

So, the counter is incremented each time the collection is modified.

Btw, the fastRemove method is used in the remove method, which we use inside the loop.

Here's a small fragment of the ArrayList iterator's inner workings:

private class Itr implements Iterator<E> {
  ....
  int expectedModCount = modCount;

  final void checkForComodification() {
    if (modCount != expectedModCount)               // <=
      throw new ConcurrentModificationException();
  }
}

Let me explain that last fragment. If the collection's modification count doesn't match the expected count (the number of modifications made before the iterator was created plus the number of modifications made through the iterator itself), a ConcurrentModificationException is thrown. That's only possible when we modify the collection using its own methods while iterating over it (i.e. in parallel with the iterator). That's what the second warning is about.
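The mechanism described above can be reproduced in a few lines. This is a minimal sketch with hypothetical names (`ComodificationDemo` and its methods are not from Kafka). It contrasts removing through the collection, which changes modCount behind the iterator's back, with removing through the iterator, which keeps both counters in sync.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.Iterator;
import java.util.List;

class ComodificationDemo {

    // Removes 'target' through the list itself inside a for-each loop.
    // Returns true if the iterator detected the modification and threw.
    static boolean throwsWhenRemovingViaCollection(List<String> items, String target) {
        try {
            for (String item : items) {
                if (item.equals(target)) {
                    items.remove(item);   // increments modCount, not expectedModCount
                }
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes 'target' through the iterator, which updates expectedModCount as well.
    static void removeViaIterator(List<String> items, String target) {
        for (Iterator<String> it = items.iterator(); it.hasNext();) {
            if (it.next().equals(target)) {
                it.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<String> first = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        System.out.println(throwsWhenRemovingViaCollection(first, "a")); // true

        List<String> second = new ArrayList<>(Arrays.asList("a", "b", "c", "d"));
        removeViaIterator(second, "a");
        System.out.println(second); // [b, c, d]
    }
}
```

The analyzer says the exception "may occur" for a reason: if the removed element happens to be the second-to-last one, hasNext() returns false right after the removal, and the loop ends silently without the check ever running.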

So, I've explained the analyzer messages. Now let's put it all together:

We attempt to delete an element from the collection while the Iterator is still running:

topicNames.remove(topicIds.remove(topicName));
// topicNames: Collection<String>
// topicIds: Map<String, Uuid>

However, since an element of an incompatible type is passed to ArrayList for deletion (the remove method of topicIds returns a Uuid object), the modification count won't increase, and nothing will be deleted either. Simply put, this call has no effect.
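The no-op behavior is easy to check in isolation. The sketch below is an approximation: it uses `java.util.UUID` as a stand-in for Kafka's `Uuid` class, and the class name, method name, and topic names are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

class IncompatibleRemoveDemo {

    // Runs the suspicious loop and returns how many topic names are left afterwards.
    static int namesLeftAfterLoop() {
        Map<String, UUID> topicIds = new HashMap<>();
        topicIds.put("orders", UUID.randomUUID());
        topicIds.put("payments", UUID.randomUUID());

        Collection<String> topicNames = new ArrayList<>(topicIds.keySet());

        for (String topicName : topicNames) {
            // Collection.remove takes Object, so passing a UUID compiles.
            // A UUID never equals a String, so nothing is found or removed,
            // modCount stays the same, and the iterator does not complain.
            topicNames.remove(topicIds.remove(topicName));
        }
        return topicNames.size();
    }

    public static void main(String[] args) {
        System.out.println(namesLeftAfterLoop()); // 2
    }
}
```

So the loop completes without an exception only because the removal never succeeds. Passing the right type here would turn the silent no-op into a possible ConcurrentModificationException.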

I'd venture to guess that the developer's intent is clear. If that's the case, one way to fix these two warnings could be as follows:

Collection<String> topicNames = new ArrayList<>(topicNameCollection);

List<String> removableItems = new ArrayList<>();

for (final String topicName : topicNames) {
  KafkaFutureImpl<Void> future = new KafkaFutureImpl<>();

  if (allTopics.remove(topicName) == null) {
    ....
  } else {
    topicIds.remove(topicName);
    removableItems.add(topicName);
    future.complete(null);
  }
  ....
}
topicNames.removeAll(removableItems);

Void, sweet void

Where would we go without our all-time favorite null and its potential problems, right? Let me show you the code fragment for which the analyzer issued the following warning:

V6008 Potential null dereference of 'oldMember' in function 'removeStaticMember'. ConsumerGroup.java 311, ConsumerGroup.java 323

@Override
public void removeMember(String memberId) {
  ConsumerGroupMember oldMember = members.remove(memberId);
  ....
  removeStaticMember(oldMember);
  ....
}

private void removeStaticMember(ConsumerGroupMember oldMember) {
  if (oldMember.instanceId() != null) {
    staticMembers.remove(oldMember.instanceId());
  }
}

If members doesn't contain an object with the memberId key, oldMember will be null. It can lead to a NullPointerException in the removeStaticMember method.

The fix is simple: check the parameter for null before using it:

if (oldMember != null && oldMember.instanceId() != null) {
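Here is a minimal, self-contained sketch of that guard in context. `GroupDemo` and `Member` are hypothetical stand-ins for ConsumerGroup and ConsumerGroupMember, and `addMember` and `staticMemberCount` exist only to make the example runnable.

```java
import java.util.HashMap;
import java.util.Map;

class GroupDemo {

    // Hypothetical stand-in for ConsumerGroupMember.
    static final class Member {
        private final String instanceId;

        Member(String instanceId) {
            this.instanceId = instanceId;
        }

        String instanceId() {
            return instanceId;
        }
    }

    private final Map<String, Member> members = new HashMap<>();
    private final Map<String, String> staticMembers = new HashMap<>();

    void addMember(String memberId, String instanceId) {
        members.put(memberId, new Member(instanceId));
        if (instanceId != null) {
            staticMembers.put(instanceId, memberId);
        }
    }

    void removeMember(String memberId) {
        Member oldMember = members.remove(memberId);   // null if memberId is unknown
        removeStaticMember(oldMember);
    }

    private void removeStaticMember(Member oldMember) {
        // The null check on oldMember prevents a NullPointerException for unknown ids.
        if (oldMember != null && oldMember.instanceId() != null) {
            staticMembers.remove(oldMember.instanceId());
        }
    }

    int staticMemberCount() {
        return staticMembers.size();
    }

    public static void main(String[] args) {
        GroupDemo group = new GroupDemo();
        group.addMember("member-1", "instance-1");
        group.removeMember("unknown");                  // no exception
        System.out.println(group.staticMemberCount());  // 1
        group.removeMember("member-1");
        System.out.println(group.staticMemberCount());  // 0
    }
}
```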

The next error will be the last one in the article, as I'd like to wrap things up on a positive note. The code below, like the one at the beginning of this article, has a common and silly typo. However, it can certainly lead to unpleasant consequences.

Let's take a look at this code fragment:

protected SchemaAndValue roundTrip(...., SchemaAndValue input) {
  String serialized = Values.convertToString(input.schema(),
                                             input.value());

  if (input != null && input.value() != null) {   
    ....
  }
  ....
}

Yeah, that's right. The method actually accesses the input object first, and then checks whether it's referencing null.

V6060 The 'input' reference was utilized before it was verified against null. ValuesTest.java 1212, ValuesTest.java 1213
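One way to avoid this pattern is to put the null check before the first dereference. The sketch below is only an illustration of that ordering. `NullCheckOrderDemo`, `Input`, and `describe` are hypothetical names, and `Input` is a simplified stand-in for SchemaAndValue.

```java
class NullCheckOrderDemo {

    // Hypothetical stand-in for SchemaAndValue; only the value() accessor matters here.
    static final class Input {
        private final Object value;

        Input(Object value) {
            this.value = value;
        }

        Object value() {
            return value;
        }
    }

    // The null check comes first, so input.value() is never called on a null reference.
    static String describe(Input input) {
        if (input == null || input.value() == null) {
            return "no value";
        }
        return String.valueOf(input.value());
    }

    public static void main(String[] args) {
        System.out.println(describe(null));             // no value
        System.out.println(describe(new Input(null)));  // no value
        System.out.println(describe(new Input(42)));    // 42
    }
}
```

If a null input is never expected at all, dropping the check and failing fast (for example with `java.util.Objects.requireNonNull`) is another reasonable choice. Either way, the check and the first use should agree with each other.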

Again, I'll note that such typos are understandable. However, they can lead to some pretty nasty results. It's tough and inefficient to search for these things in the code manually.

Conclusion

In sum, I'd like to circle back to the previous point. Manually searching through the code for all these errors is a very time-consuming and tedious task. It's not unusual for issues like the ones I've shown to lurk in code for a long time. The last bug dates back to 2018. That's why it's a good idea to use static analysis tools. If you'd like to know more about PVS-Studio, the tool we have used to detect all those errors, you can find out more here.

That's all. Let's wrap things up here. "Oh, and in case I don't see ya, good afternoon, good evening, and good night."


I almost forgot! Catch a link to learn more about a free license for open-source projects.
