preg_match form and get name of field

I have multiple forms like this:

$string = '
Form number 1
<form class="form-search" method="post" action="/index.php">
  <div class="form-group">
    <input id="address_box" type="text" class="form-control" name="pn" value="" onfocus="this.select()" />
  </div>
<span class="btn btn-s btn-caps"><input type="submit" value="start" /></span>
</form>
Form number 2
<form class="form-search" method="post" action="/home.php">
  <div class="form-group">
    <input id="address_box" type="text" class="form-control" name="y" value="" onfocus="this.select()" />
  </div>
<span class="btn btn-s btn-caps"><input type="submit" value="start" /></span>
</form>
Form number 3
<form class="form-search" method="post" action="/index.php">
  <div class="form-group">
    <input id="address_box" type="text" class="form-control" name="x" value="" onfocus="this.select()" />
  </div>
<span class="btn btn-s btn-caps"><input type="submit" value="start" /></span>
</form>
Form number 4
<form class="form-search" method="post" action="/contact.php">
  <div class="form-group">
    <input id="address_box" type="text" class="form-control" name="c" value="" onfocus="this.select()" />
  </div>
<span class="btn btn-s btn-caps"><input type="submit" value="start" /></span>
</form>
Form number 5
<form class="form-search" method="post" action="/index.php">
  <div class="form-group">
    <input id="address_box" type="text" class="form-control" name="v" value="" onfocus="this.select()" />
  </div>
<span class="btn btn-s btn-caps"><input type="submit" value="start" /></span>
</form>
Form number 6
<form class="form-search" method="post" action="/index.php?a=v">
  <div class="form-group">
    <input id="address_box" type="text" class="form-control" name="k" value="" onfocus="this.select()" />
  </div>
<span class="btn btn-s btn-caps"><input type="submit" value="start" /></span>
</form>
';

I want to:

Preg_match:
START = <form
WHERE action CONTAIN /index.php but nothing after it
EX: action="/index.php" or action="http://whatever.com/index.php"
    can't be action="/index.php?s=w"
FIND name="[A-Za-z]{1}"
END = </form>

Repeat this for each form until the (first) matching form is found, then output the [A-Za-z]{1} match.

Here is the code:

$pat = '~<form[^>]+action="[^"]*/(?:index.php)"[^>]*>.*?name="([a-zA-Z]{1})".*?</form>~s';
preg_match($pat,$string,$match);
echo $match[1];

It should select the matching form (number 3) and output x.

But the output I get is y (form number 2).

Any help, please?

Thanks.

The XPath way:

$dom = new DOMDocument;
@$dom->loadHTML($string); // @ silences warnings about imperfect HTML
$xp = new DOMXPath($dom);
// "/index.php" is 10 characters long and XPath's substring() is 1-based,
// so the last 10 characters start at string-length(@action) - 9
$query = '//form[substring(@action, string-length(@action) - 9) = "/index.php"]'
       . '/div/input/@name[string-length(.)=1]';
$nameList = $xp->query($query);
$result = null;
foreach ($nameList as $nameNode) {
    $char = $nameNode->nodeValue;
    $ascii = ord(strtolower($char));
    // check that it is a letter using its ASCII code (a-z is 97-122)
    if ($ascii < 123 && $ascii > 96) {
        $result = $char;
        break;
    }
}
echo $result;
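
As a quick sanity check on the suffix test: XPath 1.0 has no ends-with() function, and substring() counts positions from 1, so the offset is easy to get wrong. The sketch below evaluates two offsets against an absolute URL; the one-form sample markup and the variable names are made up for illustration.

```php
<?php
// Hypothetical one-form document with an absolute action URL.
$dom = new DOMDocument;
$dom->loadHTML('<form action="http://whatever.com/index.php"></form>');
$xp = new DOMXPath($dom);

// "/index.php" is 10 characters; positions are 1-based, so the last
// 10 characters start at string-length - 9.
$last10 = $xp->evaluate('substring(//form/@action, string-length(//form/@action) - 9)');
// Starting one position earlier returns 11 characters instead.
$last11 = $xp->evaluate('substring(//form/@action, string-length(//form/@action) - 10)');

var_dump($last10 === '/index.php');   // bool(true)
var_dump($last11 === 'm/index.php');  // bool(true)
```

With an action of exactly "/index.php" both offsets return the whole string, so the difference only shows up on longer values such as absolute URLs.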

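XPath is designed to target one or several elements in the DOM tree (the tree representation of an HTML document). So //elt1/elt2/elt3 defines a path (where elt1, elt2, ... are tags), and everything between square brackets is a condition on the current node.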

//    # from everywhere in the DOM tree
form  # a form tag
[     # condition for the current element (the form tag):
      # must have an attribute "action" that ends with "/index.php".
      # In other words: the last 10 characters of the "action" attribute
      # must be "/index.php" (substring() is 1-based, hence the - 9)
  substring(@action, string-length(@action) - 9) = "/index.php"
]
      # let's continue the path down to the name attribute of the input tag
/div/input/@name
      # condition for the name attribute
      # . is the current node, it must be exactly one character long
[string-length(.)=1]
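
If you would rather stay with preg_match, the reason the original pattern returns y is that `.*?` is allowed to run past `</form>`: form 1 matches the action test, its name ("pn") is not a single letter, so the lazy quantifier keeps going into form 2. One possible fix, shown here only as a sketch, is to forbid crossing `</form>` with a negative lookahead. The three-form sample is a made-up, shortened version of the input, and regex parsing of HTML stays fragile (attribute order, single quotes, nested markup).

```php
<?php
// Hypothetical shortened sample, shaped like the forms in the question.
$sample = '
<form method="post" action="/index.php">
  <input type="text" name="pn" value="" />
</form>
<form method="post" action="/home.php">
  <input type="text" name="y" value="" />
</form>
<form method="post" action="http://whatever.com/index.php">
  <input type="text" name="x" value="" />
</form>
';

// (?:(?!</form>).)*? advances one character at a time but refuses to step
// over "</form>", so the name has to be found inside the same form.
// The dot in index\.php is escaped so it only matches a literal dot.
$pat = '~<form\b[^>]+\baction="[^"]*/index\.php"[^>]*>'
     . '(?:(?!</form>).)*?\bname="([a-zA-Z])"~s';

$found = preg_match($pat, $sample, $match) ? $match[1] : null;
echo $found, "\n"; // prints "x"
```

The first form is rejected because its name has two letters, the second because of its action, and the match comes from the third.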