mirror of
git://sourceware.org/git/libabigail.git
synced 2025-03-10 16:47:52 +00:00
457f5cb202
1 Commits
Author | SHA1 | Message | Date | |
---|---|---|---|---|
|
dd55550355 |
reader: Handle 'abi-corpus' element being possibly empty
This problem was reported at https://sourceware.org/bugzilla/show_bug.cgi?id=27616. The abixml reader wrongly assumes that the 'abi-corpus' element is always non-empty. Note that until now, the only emitter of abixml consumed in practice was abg-writer.cc and it only emits non-empty 'abi-corpus' elements. So the issue wasn't exposed. So, the reader assumes that an 'abi-corpus' element has at least a text node. For instance, consider this minimal input file named test-v0.abi: $cat test-v0.abi <abi-corpus-group architecture='elf-arm-aarch64'> <abi-corpus path='vmlinux' architecture='elf-arm-aarch64'> </abi-corpus> </abi-corpus-group> $ Now, compare it to this file where the abi-corpus element is an empty element (doesn't even contain any text): $cat test-v0.abi <abi-corpus-group architecture='elf-arm-aarch64'> <abi-corpus path='vmlinux'/> </abi-corpus-group> $ comparing the two files with abidiff (wrongly) reports: $ abidiff test-v0.abi test-v1.abi ELF architecture changed Functions changes summary: 0 Removed, 0 Changed, 0 Added function Variables changes summary: 0 Removed, 0 Changed, 0 Added variable architecture changed from 'elf-arm-aarch64' to '' $ What's happening is that read_corpus_from_input is getting out early when it sees that the node is empty. This is at: xmlNodePtr node = ctxt.get_corpus_node(); @@ -1907,10 +1925,14 @@ read_corpus_from_input(read_context& ctxt) corp.set_soname(reinterpret_cast<char*>(soname_str.get())); } if (!node->children) // <---- we get out early here and we return nil; // forget about the properties of // the current empty corpus element node So, at its core, fixing the issue at hand involves avoiding the early return there. But then, it turns out that's not enough. In the current setting, the different abixml processing entry points are designed to be used in a semi "streaming" mode. So for instance, read_translation_unit_from_input can be invoked repeatedly to "stream in" the next translation unit at each invocation. Alternatively, the lower level xmlTextReaderNext can be used to iterate over XML node until we reach the translation unit XML element we are interested in. At that point xmlTextReaderExpand can be used to expand the XML node, then we let the context know that this is the current node of the corpus that needs to be processed, using read_context::get_corpus_node. Once we've done that, read_translation_unit_from_input can be called to process that particular corpus node. Note that the corpus node at hand, that needs to be processed will be retrieved by read_context::get_corpus_node. These two modes of operation are also available for read_corpus_from_input, read_symbol_db_from_input, read_elf_needed_from_input etc. Today, these functions all assume that the current node returned by read_context::get_corpus_node is the node /before/ the node of the corpus to be processed. So they all start looking at the /next sibling/ of the node returned by read_context::get_corpus_node. So the code was implicitly assuming that read_context::get_corpus_node was pointing to a text node that was before the node of the corpus that we want to process. This is wrong. read_context::get_corpus_node should just return the current node of the corpus that needs to be processed and voila. And so read_context::set_corpus_node should be used to set the current node of the corpus to the current element node that needs to be processed. That's the spirit of the change done by this patch. As its name suggests, the existing xml::advance_to_next_sibling_element is used to skip non element xml nodes (including text nodes) and move to the next element node to process, which is set to the context using read_context::set_corpus_node. Then the actual processing functions like read_corpus_from_input get the node to process, using read_context::get_corpus_node and process it rather than processing the sibling node that comes after it. The other changes are either to prevent related crashes that I noticed while doing various tests, update the abilint tool used to read and debug abixml input files and add better documentation. * src/abg-reader.cc (read_context::get_corpus_node): Add comment to this member function. (read_translation_unit_from_input, read_symbol_db_from_input) (read_elf_needed_from_input): Start processing the current node of the corpus that needs to be processed rather than its next sibling. Once the processing is done, set the new "current node of the corpus to be processed" properly by skipping to the next element node to be processed. (read_corpus_from_input): Don't get out early when the 'abi-corpus' element is empty. If, however, it has children node, skip to the first child element and flag it -- using read_context::set_corpus_node -- as being the element node to be processed by the processing facilities of the reader. If we are in a mode where we called xmlTextReaderExpand ourselves to get the node to process, then it means we need to free that node indirectly by calling xmlTextReaderNext. In that case, that node should not be flagged by read_context::set_corpus_node. Add more comments. * src/abg-corpus.cc (corpus::is_empty): Do not crash when no symtab is around. * src/abg-libxml-utils.cc (go_to_next_sibling_element_or_stay): Fix typo in comment. (advance_to_next_sibling_element): Don't crash when given a nil node. * tests/data/test-abidiff/test-PR27616-squished-v0.abi: Add new test input. * tests/data/test-abidiff/test-PR27616-squished-v1.abi: Likewise. * tests/data/test-abidiff/test-PR27616-v0.xml: Likewise. * tests/data/test-abidiff/test-PR27616-v1.xml: Likewise. * tests/data/Makefile.am: Add the new test inputs above to source distribution. * tests/test-abidiff.cc (specs): Add the new tests inputs above to this harness. * tools/abilint.cc (main): Support writing corpus groups. Signed-off-by: Dodji Seketeli <dodji@redhat.com> |