
Perl XML::XPath adds a pile of junk to the document

December 25, 2025 24lexus

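I have a web.xml that I want to update via XPath. The target element is modified correctly, but a pile of junk is added at the beginning of the document. The junk appears even when I modify nothing and simply parse the file and print it back.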

Code:

require Cwd;
use File::Temp qw/ tempfile tempdir/;
use lib 'menu/perl-modules/lib/site_perl';
use XML::XPath;
use XML::XPath::NodeSet;
use strict;
use warnings;

my $file = "/tmp/web.xml";
my $xp   = XML::XPath->new( filename => $file );
my ($root) = $xp->find('/')->get_nodelist;    # list context, so $root receives the node itself
#$xp->setNodeText( $xpath, $newValue );

# Write every node found at the document root back to the same file
open( my $out, '>', $file ) or die "Cannot open $file: $!";
foreach my $node ( $xp->find('/')->get_nodelist ) {
  print {$out} $node->toString;
}
close($out);

Input file:
<!DOCTYPE web-app PUBLIC 
 "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN" 
  "http://java.sun.com/dtd/web-app_2_3.dtd" > 
<web-app> 
   <filter> 
      <filter-name>LocaleFilter</filter-name> 
      .... 
</web-app> 

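Output: roughly 700 lines of comments at the beginning of the document, which look like some kind of expansion of the referenced DTD. For readability, I include only the first few lines: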
<!-- 
DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS HEADER. 
 
Copyright 2000-2007 Sun Microsystems, Inc. All rights reserved. 
 
The contents of this file are subject to the terms of either the GNU 
General Public License Version 2 only ("GPL") or the Common Development 
and Distribution License("CDDL") (collectively, the "License").  You 
may not use this file except in compliance with the License. You can obtain 
a copy of the License at https://glassfish.dev.java.net/public/CDDL+GPL.html 
or glassfish/bootstrap/legal/LICENSE.txt.  See the License for the specific 
language governing permissions and limitations under the License. 
 
When distributing the software, include this License Header Notice in each 
file and include the License file at glassfish/bootstrap/legal/LICENSE.txt. 
Sun designates this particular file as subject to the "Classpath" exception 
as provided by Sun in the GPL Version 2 section of the License file that 
accompanied this code.  If applicable, add the following below the License 
Header, with the fields enclosed by brackets [] replaced by your own 
identifying information: "Portions Copyrighted [year] 
[name of copyright owner]" 
 
Contributor(s): 
 
If you wish your version of this file to be governed by only the CDDL or 
only the GPL Version 2, indicate your decision by adding "[Contributor] 
elects to include this software in this distribution under the [CDDL or GPL 
Version 2] license."  If you don't indicate a single choice of license, a 
recipient has the option to distribute your version of this file under 
either the CDDL, the GPL Version 2 or to extend the choice of license to 
its licensees as provided above.  However, if you add GPL Version 2 code 
and therefore, elected the GPL Version 2 license, then the option applies 
only if the new code is made subject to such option by the copyright 
holder. 
--><!-- 
This is the XML DTD for the Servlet 2.3 deployment descriptor. 

Please refer to the following approach:

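I don't understand why this module takes all of the linked DTD documents into account, since as far as I know it performs no validity checking.

In addition, although the module lets you change and add nodes in the document, there is no obvious way to delete a node.

However, the comments you want to exclude are children of the root node, so you can effectively remove them by regenerating the document from the root node's only element child.

This code demonstrates: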

use strict; 
use warnings; 
use autodie; 
use 5.010; 
 
use XML::XPath; 
 
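# Parse the XML that follows __DATA__
my $xp   = XML::XPath->new( ioref => *DATA );
# '/*' selects only the root element, leaving out its sibling comments
my ($new_root) = $xp->findnodes('/*');

# Serialize just that element
print $new_root->toString, "\n";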
 
__DATA__ 
<!DOCTYPE web-app PUBLIC 
 "-//Sun Microsystems, Inc.//DTD Web Application 2.3//EN" 
  "http://java.sun.com/dtd/web-app_2_3.dtd" > 
<web-app> 
  <filter> 
    <filter-name>LocaleFilter</filter-name> 
  </filter> 
</web-app> 

Output:
<web-app> 
  <filter> 
    <filter-name>LocaleFilter</filter-name> 
  </filter> 
</web-app>
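
To connect this back to the question's file-based workflow, here is a minimal sketch of the same idea wrapped in two small helpers. The names `root_element_xml` and `write_file` are hypothetical and not part of XML::XPath, and the inline sample uses a plain leading comment as a stand-in for the DTD comments so that it runs without fetching anything. It assumes the junk always sits beside the root element rather than inside it.

```perl
use strict;
use warnings;

use File::Temp qw(tempfile);
use XML::XPath;

# Hypothetical helper (not from the original answer): parse an XML string
# and return only the serialized root element, leaving out comments or
# other nodes that sit beside it under the document root.
sub root_element_xml {
    my ($xml) = @_;
    my $xp = XML::XPath->new( xml => $xml );
    my ($root_element) = $xp->findnodes('/*');
    die "no root element found\n" unless $root_element;
    return $root_element->toString;
}

# Hypothetical helper: write the string to $path, failing loudly on error.
sub write_file {
    my ( $path, $content ) = @_;
    open( my $fh, '>', $path ) or die "Cannot write $path: $!";
    print {$fh} $content, "\n";
    close($fh) or die "Cannot close $path: $!";
    return;
}

# Small inline sample; the leading comment stands in for the DTD comments.
my $sample = <<'XML';
<!-- stray comment before the root element -->
<web-app>
  <filter>
    <filter-name>LocaleFilter</filter-name>
  </filter>
</web-app>
XML

my $clean = root_element_xml($sample);
print $clean, "\n";

# Write the cleaned document to a throwaway file instead of a real web.xml.
my ( $tmp_fh, $tmp_path ) = tempfile( UNLINK => 1 );
close($tmp_fh);
write_file( $tmp_path, $clean );
```

As in the output above, the serialized element carries no DOCTYPE declaration. If whatever reads the web.xml needs one, it would have to be printed separately before the element, which this sketch does not attempt.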