首页  >  文章  >  Java  >  搜索查询 Apache Solr 中的错误

搜索查询 Apache Solr 中的错误

PHPz
PHPz原创
2024-08-01 03:21:13359浏览

我们再次检查 Apache 产品。这次我们选择了Solr,一个开源搜索服务器平台。 Solr 使您能够快速有效地搜索数据库和在线资源中的信息。当面对如此复杂的任务时,即使对于经验丰富的 Apache 开发人员来说,也很容易犯错误。在本文中,我们将讨论这些类型的错误。

Search query for bugs in Apache Solr

你是谁,索尔?

不久前,我们检查了最著名的 Apache 项目之一,NetBeans IDE。我们在检查过程中发现分析仪发出了许多有趣的警告。顺便说一句,开发人员很快注意到了他们,并在我之前提出了拉取请求:)这一次,我们决定看看他们的另一个大产品,Solr 全文搜索平台。

Apache Solr 于 2006 年首次推出,提供了广泛的功能,从动态集群和数据库集成到处理复杂格式的文档。该搜索引擎使您能够快速搜索和分析网站上的信息,并且还提供在 Linux 上运行的硬件上托管搜索服务器的功能。 Solr 有...

长话短说,我不想用太多细节来打扰你。知道它是一个优化大数据的便捷软件平台就足够了。我们为什么不看看它的源代码并搜索一些有趣或不寻常的东西呢?这正是我们现在要做的。

他们在这里混合了一些东西

程序员永远无法避免涉及混合运算符的拼写错误。这是其中之一:

public Map<String, List<String>> getIndexFilesPathForSnapshot(
    String collectionName, String snapshotName, String pathPrefix)
    throws SolrServerException, IOException {
  ....

  if (meta != null) {                  // <=
    throw new IllegalArgumentException(
                  "The snapshot named " + snapshotName +
                  " is not found for collection " + collectionName);
  }

  DocCollection collectionState = solrClient.getClusterState() 
                                            .getCollection(collectionName);
  for (Slice s : collectionState.getSlices()) {
    List<CoreSnapshotMetaData> replicaSnaps = 
                     meta.getReplicaSnapshotsForShard(s.getName());  // <=
    ....
  }
  return result;
}

此方法使用 meta 变量来存储有关系统快照的信息。上面的代码片段中还检查了 meta != null。如果是这样,则抛出 IllegalArgumentException 。仅此一点就显得很奇怪。好的,现在让我们看看循环体。这是调用 getReplicaSnapshotsForShard 的地方。鉴于这里的 meta 始终为 null,我们得到 NullPointerException。看来开发者只是犯了一个错字,混淆了运算符。因此,如果 meta 等于 null.

则应抛出异常

PVS-Studio 分析器充当校对员并报告检测到的拼写错误:

V6008 “元”的空取消引用。 SolrSnapshotsTool.java 262

我想补充一点,这种混合运算符的错误比您想象的更常见。例如,我已经在 NetBeans 项目中看到至少两个类似的:

private SourcesModel getModel() {
  SourcesModel tm = model.get();
  if (tm == null) {
    tm.sourcePath.removePropertyChangeListener (this);
    tm.debugger.getSmartSteppingFilter ().
    removePropertyChangeListener (this);
  }
  return tm;
}

V6008 'tm' 的空取消引用。 SourcesModel.java 713

public void propertyChange(PropertyChangeEvent evt) {
  ....
  synchronized (this) {
    artifacts = null;
    if (listeners == null && listeners.isEmpty()) {
      return;
    }
    ....
  }
}

V6008 'listeners' 的空解除引用。 MavenArtifactsImplementation.java 613

我们可能发现了新的拼写错误模式。我们将继续关注。

让我们继续下一个片段。在此类中,开发人员混淆了 get 方法之一返回的内容。

public class FunctionQParser extends QParser {
  ....
  boolean parseMultipleSources = true;
  boolean parseToEnd = true;
  ....
  public void setParseMultipleSources(boolean parseMultipleSources) {
    this.parseMultipleSources = parseMultipleSources;
  }

  /** parse multiple comma separated value sources */
  public boolean getParseMultipleSources() {
    return parseMultipleSources;
  }

  public void setParseToEnd(boolean parseToEnd) {
    this.parseToEnd = parseToEnd;
  }

  /** throw exception if there is extra 
      stuff at the end of the parsed valuesource(s). */
  public boolean getParseToEnd() {
    return parseMultipleSources;
  }
  ....
}

该类旨在解析某个书面数学函数。有两个属性可以更改解析器的行为:parseMultipleSources 分析所有数值源,parseToEnd 检查是否应将带有字符串的函数解析到底。

现在让我们看看这些字段的 getset 方法。 parseMultipleSources 字段在 getParseToEnd 中返回。程序员在这里混淆了应该返回什么字段。

分析器可以轻松检测到不匹配的返回字段:

V6091 可疑的 getter 实现。可能应该返回“parseToEnd”字段。 FunctionQParser.java 87, FunctionQParser.java 57

以下代码片段中的拼写错误可能会导致 NullPointerException.

public void stringField(FieldInfo fieldInfo, String value) throws IOException {
  // trim the value if needed
  int len = value != null ? 
             UnicodeUtil.calcUTF16toUTF8Length(value, 0, value.length()) : 0;
  if (value.length() > maxLength) {               // <=    
    value = value.substring(0, maxLength);
  }
  countItem(fieldInfo.name, value, len);
}

让我们仔细看看:value 首先与 null 进行比较,然后在下一行中调用 length() 方法价值。但变量可以是null!最有可能的是,开发人员应该使用 len 变量而不是调用 length().

由于分析器消息,我们发现了这个拼写错误:

V6008 'value' 的潜在空取消引用。 IndexSizeEstimator.java 735, IndexSizeEstimator.java 736

让我们看另一个有拼写错误的代码片段:

public Object doWork(Object value) throws IOException {
  ....
  List<?> list = (List<?>) value;
  // Validate all of same type and are comparable
  Object checkingObject = list.get(0);
  for (int idx = 0; idx < list.size(); ++idx) {
    Object item = list.get(0);                         // <=

    if (null == item) {
      throw new IOException(....);
    } else if (!(item instanceof Comparable<?>)) {
      throw new IOException(....);
    } else if (!item.getClass()
                    .getCanonicalName()
                    .equals(checkingObject.getClass()
                                          .getCanonicalName())) {
       throw new IOException(....);
    }
  }
  ....
}

这里值得关注 for 循环。像往常一样,程序员声明循环和 idx 计数器变量,然后他们获取列表中的项目数。但是,有一个问题:循环的每次迭代仅采用 0 处的元素: list.get(0) 索引。

PVS-Studio 分析器检测到此错误:

V6016 通过循环内的常量索引对“列表”对象的元素进行可疑访问。 AscEvaluator.java 56

The following example shows two methods with different names. However, they do the same thing.

private static List<Feature> makeFeatures(int[] featureIds) {
  final List<Feature> features = new ArrayList<>();
  for (final int i : featureIds) {
    Map<String, Object> params = new HashMap<String, Object>();
    params.put("value", i);
    final Feature f = Feature.getInstance(solrResourceLoader, 
                       ValueFeature.class.getName(), "f" + i, params);
    f.setIndex(i);
    features.add(f);
  }
  return features;
}

private static List<Feature> makeFilterFeatures(int[] featureIds) {
  final List<Feature> features = new ArrayList<>();
  for (final int i : featureIds) {
    Map<String, Object> params = new HashMap<String, Object>();
    params.put("value", i);
    final Feature f = Feature.getInstance(solrResourceLoader, 
                       ValueFeature.class.getName(), "f" + i, params);
    f.setIndex(i);
    features.add(f);
  }
  return features;
}

The first one creates a list of the Feature class objects. The second one, based on the name, should return a different type or filter these Features. If the FilterFeature type existed in the source code, we could assume that the developers simply made a typo. However, there's no such type. Maybe the method was copied and the developers forgot about it after copying it.

Anyway, this snippet looks very suspicious. And the analyzer proves this:

V6032 It is odd that the body of method 'makeFeatures' is fully equivalent to the body of another method 'makeFilterFeatures'. TestLTRScoringQuery.java 66, TestLTRScoringQuery.java 79

Forgot to check? Got error on track

If your gut tells you that null checks are unnecessary, don't trust it. In the code below, the "extra" check could've prevented NullPointerException.

public static Map<String, Object> postProcessCollectionJSON(
                                            Map<String, Object> collection) {
  final Map<String, Map<String, Object>> shards = collection != null   // <=
         ? (Map<String, Map<String, Object>>)
           collection.getOrDefault("shards", Collections.emptyMap())
         : Collections.emptyMap();
  final List<Health> healthStates = new ArrayList<>(shards.size());
  shards.forEach(
  ....
  );
  collection.put("health", Health.combine(healthStates).toString());   // <=
  return collection;
}

In the beginning of the method, the programmer checks if the collection reference is empty. If that's the case, then shards are derived from the collection. The most interesting thing is that healthStates is added to the collection at the end, regardless of whether the collection reference is empty or not.

Here's the analyzer warning for this code fragment:

V6008 Potential null dereference of 'collection'. ClusterStatus.java 303, ClusterStatus.java 335

And in the next example, the developers made an obvious mistake in the class constructor to support parallel distribution of thread work.

public class ParallelStream extends CloudSolrStream 
                            implements Expressible {
  ....
  private transient StreamFactory streamFactory;

  public ParallelStream(String zkHost, 
                        String collection, 
                        String expressionString, 
                        int workers, 
                        StreamComparator comp
) throws IOException {
    TupleStream tStream = this.streamFactory
                              .constructStream(expressionString);  // <=
    init(zkHost, collection, tStream, workers, comp);
  }  
  ....
}

The error lies in the first line of the constructor body. The streamFactory field is accessed here, but the field isn't initialized. The developers may have forgotten to add some logic in the constructor, or accidently may have written this line.

The PVS-Studio warning:

V6090 Field 'streamFactory' is being used before it was initialized. ParallelStream.java 61

However, they didn't forget to add a check in this method. Although, I think they put it in the wrong place.

private void createNewCollection(final String collection)
 throws InterruptedException {
  ....
  pending.add(completionService.submit(call));
  while (pending != null && pending.size() > 0) {
    Future<Object> future = completionService.take();
    if (future == null) return;
    pending.remove(future);
  }
}

Let's look at the interaction with the pending field: first the programmer called add, then they decided to make a loop in which they gradually removed elements from the method. The most interesting thing is that they checked that pending isn't null in the loop condition. It looks very suspicious, considering that there's no variable zeroing in the loop body. Seems like they should've added a check before calling the add method as well.

The analyzer warning:

V6060 The 'pending' reference was utilized before it was verified against null. AbstractBasicDistributedZkTestBase.java 1664, AbstractBasicDistributedZkTestBase.java 1665

Lost exception

Like many modern languages, Java has exception handling feature. The most important thing is to not lose them, as it happened here.

private void doSplitShardWithRule(SolrIndexSplitter.SplitMethod splitMethod) 
 throws Exception {
  ....
  try {
    ZkStateReader.from(cloudClient)
                 .waitForState(collectionName, 30, 
                             TimeUnit.SECONDS,
                             SolrCloudTestCase.activeClusterShape(1, 2));
  } catch (TimeoutException e) {
    new RuntimeException("Timeout waiting for
                          1shards and 2 replicas.", e);     // <=
  }
  ....
}

The error lies in the catch block: the developers created the RuntimeException object there, even added a link to the current intercepted TimeoutException and a message. But they forgot to write the throw keyword. So, the exception is never thrown.

The Lost and Found Bureau, in the form of our analyzer, found the lost exception and notified us about it:

V6006 The object was created but it is not being used. The 'throw' keyword could be missing. ShardSplitTest.java 773

How arithmetic errors interfere with testing

Does testing make software less buggy? Well, humans are the ones who write tests, and they can't help but make mistakes. This is what happened in the following example.

Public class SpellCheckCollatorTest extends SolrTestCaseJ4 {
  private static final int NUM_DOCS_WITH_TERM_EVERYOTHER = 8;
  private static final int NUM_DOCS = 17;
  ....
  @Test
  public void testEstimatedHitCounts() {
    ....
    for (int val = 5; val <= 20; val++) {
      String hitsXPath = xpathPrefix + "long[@name='hits']"; 

      if (val <= NUM_DOCS_WITH_TERM_EVERYOTHER) {
        int max = NUM_DOCS;
        int min = (/* min collected */ val) / 
                  (/* max docs possibly scanned */ NUM_DOCS);
        hitsXPath += "[" + min + " <= . and . <= " + max + "]";
      } 
    ....
    }
  }
  ....
}

A string containing the min variable, which in turn is the result of dividing val by NUM_DOCS, is written to hitsXPath here. Looking closer, you can see that the maximum and minimum values of val in this fragment are 8 and 5. The NUM_DOCS value is always 17. In all cases, min is zero in integer division. Most likely, the programmer forgot to convert division arguments to real numbers and change the type of the min variable.

We found this error using a brand-new diagnostic rule in the PVS-Studio analyzer:

V6113 The '(val) / (NUM_DOCS)' expression evaluates to 0 because the absolute value of the left operand 'val' is less than the value of the right operand 'NUM_DOCS'. SpellCheckCollatorTest.java 683

Danger of checking objects by reference

The class bellow describes the equalsTo comparison method.

private static class RandomQuery extends Query {
  private final long seed;
  private float density;
  private final List<BytesRef> docValues;
  ....
  private boolean equalsTo(RandomQuery other) {
    return seed == other.seed && 
           docValues == other.docValues && 
           density == other.density;
  }
}

Comparisons of the seed and density fields almost don't cause any questions (except, perhaps, for the density field that is a real number), because the values directly written into them are considered. However, since this field has a reference type, the docValues comparison via '==' looks very dubious. This check considers only the address and not the internal state of the object.

With such defect you can miss the case when two different lists store the same values because the lists are the same, but the references are different. It seems that when the developers named the equalsTo method, they hardly meant that it should compare references rather than the internal state of objects.

The PVS-Studio analyzer warning:

V6013 Objects 'docValues' and 'other.docValues' are compared by reference. Possibly an equality comparison was intended. TestFieldCacheSortRandom.java 341

Suspicious lack of synchronization

You won't find an error in the next fragment, but it's still potentially there. How's that possible? Take a look at this code and find out why.

public abstract class CachingDirectoryFactory extends DirectoryFactory {
  ....
  private static final Logger log = LoggerFactory.getLogger(....);
  protected Map<String, CacheValue> byPathCache = new HashMap<>();
  protected IdentityHashMap<Directory, CacheValue> byDirectoryCache = 
                                                  new IdentityHashMap<>();
  ....

  private void removeFromCache(CacheValue v) {
    log.debug("Removing from cache: {}", v);
    byDirectoryCache.remove(v.directory);
    byPathCache.remove(v.path);
  }
}

We won't be able to understand what's wrong here until we look at all the uses of the byDirectoryCache variable. In all other methods, the interaction occurs in the synchronized block. However, in the removeFromCache method, the programmer removes the collection elements outside of the synchronized blocks.

The analyzer detected this suspicious fragment:

V6102 Inconsistent synchronization of the 'byDirectoryCache' field. Consider synchronizing the field on all usages. CachingDirectoryFactory.java 92, CachingDirectoryFactory.java 228

At this point, one could say there's an error here, and the race condition could happen. However, it turns out that all removeFromCache calls are also enclosed in synchronized blocks. So, this is mostly a false positive that could be suppressed.

Although, we still can enhance this code, because there's a potential issue here. For example, when you need to use this method again, you may simply forget to enclose it in a synchronized block. Even though other methods have additional checks that the object exists in the byDirectoryCache collection, an unsynchronized call may delete an element already after the check. As a result, unnecessary actions are performed in another thread with a non-existent element of the collection, which can lead to errors in the program logic.

To protect ourselves from this, we can simply add the synchronized keyword to the removeFromCache method. So, even though there's no real error here, the static analyzer still urges us to write cleaner code.

By the way, we just recently released an article on the pitfalls of using synchronization.

Is it possible to create a few classes named the same?

In this fragment, the programmer didn't consider that classes can be renamed or declared with the same name in different packages.

private static String getFieldFlags(IndexableField f) {
  IndexOptions opts = (f == null) ? null : f.fieldType().indexOptions();

  StringBuilder flags = new StringBuilder();
  ....
  flags.append((f != null && f.getClass()
                              .getSimpleName()
                              .equals("LazyField"))  // <=
                                   ? FieldFlag.LAZY.getAbbreviation(): '-');
  ....
  return flags.toString();
}

This is an obviously unnecessary operation that may result in an error. The f variable has the getClass method called, which returns the object type, then gets and checks the name without specifying packages. All in all, there's no error here right now. However, it can arise for two reasons.

The first one is that there may be classes with the same name in different packages. In this case, it's unclear what kind of LazyField is required, and the program will run in a different way than intended.

The second one is related to changing the class name. If the name is changed, the code won't run as intended at all. And searching for all such strings in a huge code base is very difficult. Even if you resort to searching, it's something you can just forget about.

It'd be much safer to use the instanceof operator:

flags.append(f instanceof LazyDocument.LazyField
             ? FieldFlag.LAZY.getAbbreviation(): '-');

In this case, we wouldn't need to check for null, and the code would be much shorter. The chance of an error would also decrease, if there are classes with the same name in different packages, or if the name of the class changes.

The analyzer detected a potential error and issued a warning:

V6054 Classes should not be compared by their name. LukeRequestHandler.java 247

What about documentation, though?

As a final fragment, we'll look at the following code:

@Override
public UpdateCommand clone() {
  try {
    return (UpdateCommand) super.clone();
  } catch (CloneNotSupportedException e) {
    return null;                         // <=
  }
}

Let's see what's wrong with it, because everything seems fine at first glance. The analyzer informs us that returning null in the clone method is a bad idea:

V6073 It is not recommended to return null from 'clone' method. UpdateCommand.java 97

Why is it not recommended to return null from clone? It's time to consult the Java documentation:

Returns:

a clone of this instance.

Throws:

CloneNotSupportedException - if the object's class does not support the Cloneable interface. Subclasses that override the clone method can also throw this exception to indicate that an instance cannot be cloned.

The exception here indicates that the object can't be cloned. The method should return only a copy of the current object and nothing else. But why the analyzer doesn't recommend returning null from clone? It's all about further use of the code. If you constantly deviate from the recommendations in the documentation, it's difficult to catch non-standard situations.

Let's imagine a scenario where we want to use the UpdateCommand class, but the source code is unavailable, and we can't decompile it. Or we're just lazy. We can only use the built-in library with this class and focus on the interface. Our program needs us to use the clone method, so we write the following code:

try {
  UpdateCommand localCopy = field.clone(); 
  System.out.println(localCopy.toString();
} catch (CloneNotSupportedException e) {
  System.out.println("Could not clone the field"); 
}

In this code, we try to catch the CloneNotSupportedException, but we can't because the exception is a NullPointerException that causes the program to crash when calling localCopy.ToString(). This comes as a complete surprise to the developer. Deviating from official recommendations can be annoying, so it's better to always follow them

Conclusion

Let's stop here and take another look at the errors we found. Most of them are the result of carelessness, but there are some that require additional thought. For example, comparing class names without considering packages, or returning null instead of throwing an exception in the clone method.

Without special development tools like static analyzers, such bugs are difficult to find, especially in projects as large as Apache Solr. If you'd like to search for such non-obvious errors in your project, you may try our static analyzer here.

By the way, Solr isn't the only Apache product we checked:

  • 21 bugs in 21st version of Apache NetBeans
  • Big / Bug Data: analyzing the Apache Flink source code
  • Apache Hadoop code quality: production vs test

以上是搜索查询 Apache Solr 中的错误的详细内容。更多信息请关注PHP中文网其他相关文章!

声明:
本文内容由网友自发贡献,版权归原作者所有,本站不承担相应法律责任。如您发现有涉嫌抄袭侵权的内容,请联系admin@php.cn