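I am new to complex arrays in PHP. I have an associative array named $questions,
shown below (for reference, I am printing only the first five elements, because the actual array is too large):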
Array
(
[0] => Array
(
[question_id] => 24264
[question_parent_id] => 0
[question_subject_id] => 20
[question_topic_id] => 544
[question_directions] =>
[question_text] => Which of the following is the consequence of plant diseases?
[question_file] =>
[question_description] =>
[question_difficulty_type] => 1
[question_has_sub_ques] => 0
[question_picked_individually] => no
[question_appeared_count] => 0
[question_manual] => 0
[question_site_id] =>
[question_created_staff_id] => e516459cde6a92869a887cb99a911cd6
[question_added_date] => 1326866014
[question_updated_staff_id] =>
[question_updated_date] => 0
)
[1] => Array
(
[question_id] => 24269
[question_parent_id] => 0
[question_subject_id] => 20
[question_topic_id] => 544
[question_directions] =>
[question_text] => Viruses enter into their host through
[question_file] =>
[question_description] =>
[question_difficulty_type] => 1
[question_has_sub_ques] => 0
[question_picked_individually] => no
[question_appeared_count] => 0
[question_manual] => 0
[question_site_id] =>
[question_created_staff_id] => e516459cde6a92869a887cb99a911cd6
[question_added_date] => 1326866089
[question_updated_staff_id] =>
[question_updated_date] => 0
)
[2] => Array
(
[question_id] => 24274
[question_parent_id] => 0
[question_subject_id] => 20
[question_topic_id] => 544
[question_directions] =>
[question_text] => which of the following category of plant diseases cannot be controlled by chemical treatment ?
[question_file] =>
[question_description] =>
[question_difficulty_type] => 1
[question_has_sub_ques] => 0
[question_picked_individually] => no
[question_appeared_count] => 0
[question_manual] => 0
[question_site_id] =>
[question_created_staff_id] => e516459cde6a92869a887cb99a911cd6
[question_added_date] => 1326866169
[question_updated_staff_id] =>
[question_updated_date] => 0
)
[3] => Array
(
[question_id] => 24279
[question_parent_id] => 0
[question_subject_id] => 20
[question_topic_id] => 544
[question_directions] =>
[question_text] => Plants can be made disease resistant through
[question_file] =>
[question_description] =>
[question_difficulty_type] => 1
[question_has_sub_ques] => 0
[question_picked_individually] => no
[question_appeared_count] => 0
[question_manual] => 0
[question_site_id] =>
[question_created_staff_id] => e516459cde6a92869a887cb99a911cd6
[question_added_date] => 1326866226
[question_updated_staff_id] =>
[question_updated_date] => 0
)
[4] => Array
(
[question_id] => 24282
[question_parent_id] => 0
[question_subject_id] => 20
[question_topic_id] => 544
[question_directions] =>
[question_text] => Potato famine of Ireland occured in
[question_file] =>
[question_description] =>
[question_difficulty_type] => 1
[question_has_sub_ques] => 0
[question_picked_individually] => no
[question_appeared_count] => 0
[question_manual] => 0
[question_site_id] =>
[question_created_staff_id] => e516459cde6a92869a887cb99a911cd6
[question_added_date] => 1326866259
[question_updated_staff_id] =>
[question_updated_date] => 0
)
)
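Now I want to compare the value of the ['question_text'] key of each question
with the value of the ['question_text'] key of every other question in the array.
I have written the following code for this, but it has one drawback:
- During the comparison, each question is also compared with itself, which should not happen.
My code is as follows: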
function GetSimilarQuestionsBySubjectIdTopicId($subject_id, $topic_id) {
    $sql = " SELECT * FROM ".TBL_QUESTIONS." WHERE question_subject_id=".$subject_id;
    $sql .= " AND question_topic_id=".$topic_id;
    $this->mDb->Query($sql);
    $questions_data = $this->mDb->FetchArray();
    $questions = $questions_data;
    $exclude_words = array('the','at','is','are','when','whom');
    foreach($questions as $index=>$arr) {
        $questions_array = explode(' ',$arr['question_text']);
        $clean_questions = array_diff($questions_array, $exclude_words);
        $questions[$index]['question_text'] = implode(' ',$clean_questions );
    }
    foreach ($questions as $index=>$outer_data) {
        $outer_data['similar_questions_ids'] = Array();
        $outer_question = $outer_data['question_text'];
        foreach ($questions as $inner_data) {
            $inner_question = $inner_data['question_text'];
            $same_chars = similar_text($outer_question, $inner_question, $percent);
            $percentage = number_format((float)$percent, 2, '.', '');
            if($percentage > 50)
                $questions_data[$index]['similar_questions_ids'][] = $inner_data['question_id'];
        }
    }
}
Can anyone help me solve this problem? Also, any ideas for optimizing my existing code are welcome.
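Do as @Galden suggested: skipping the comparison when the question_id is the same keeps a question from being compared with itself inside the for loop. As for optimization, remove this line
$questions = $questions_data;
It looks unnecessary and can make your script slower, because PHP has to copy the whole array as soon as one of the two variables is modified.
$questions_data = $this->mDb->FetchArray();
The line above is enough. Use that variable directly instead of assigning one variable to another without a good reason.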
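To avoid comparing a question with itself, you can first check that the ID of the question differs from the ID of the question it is being compared with. So, just before you start comparing, add something like
if ($question['question_id'] !== $compared_question['question_id']) {
    // Your compare code
}
I can't really tell from your code where exactly you should put it, but I hope you get the idea!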
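One place the check could go is at the top of the inner loop. The following is only a minimal sketch, not your actual method: findSimilarQuestions is a hypothetical standalone function that takes the rows as a plain array instead of reading them through $this->mDb, the sample rows are made up, and the 50 percent threshold is taken from the question.

```php
<?php
// Hypothetical helper (not part of the original class):
// returns a map of question_id => list of ids of similar questions.
function findSimilarQuestions(array $questions, float $threshold = 50.0): array
{
    $similar = [];
    foreach ($questions as $question) {
        $similar[$question['question_id']] = [];
        foreach ($questions as $compared_question) {
            // Skip the pair when both rows are the same question.
            if ($question['question_id'] === $compared_question['question_id']) {
                continue;
            }
            // similar_text() writes the similarity percentage into $percent.
            similar_text(
                $question['question_text'],
                $compared_question['question_text'],
                $percent
            );
            if ($percent > $threshold) {
                $similar[$question['question_id']][] = $compared_question['question_id'];
            }
        }
    }
    return $similar;
}

// Made-up sample rows, for illustration only.
$sample = [
    ['question_id' => 1, 'question_text' => 'Viruses enter into their host through'],
    ['question_id' => 2, 'question_text' => 'Viruses enter into the host through'],
    ['question_id' => 3, 'question_text' => 'Potato famine of Ireland occured in'],
];
print_r(findSimilarQuestions($sample));
```

Using continue instead of wrapping the compare code in an if keeps the nesting shallow, but both forms skip the same pairs.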
foreach($questions as $index1=>$outer_data) {
    foreach($questions as $index2=>$inner_data) {
        if($index1 != $index2){ // check the indexes (keys)
            // your compare code
        }
    }
}
// ... hope it helps ...
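Since the question also asks about optimization, a variation on the index check is to start the inner loop one position after the outer one, so each pair is compared only once and a question never meets itself. This is a rough sketch under assumptions: findSimilarPairsOnce is a hypothetical name, the rows are passed in as a plain array, and a match is recorded in both directions. Note that similar_text() can return a slightly different result when its two strings are swapped, so this may not give exactly the same matches as the full double loop.

```php
<?php
// Hypothetical helper: compares every unordered pair of questions once.
function findSimilarPairsOnce(array $questions, float $threshold = 50.0): array
{
    $questions = array_values($questions); // make sure keys run 0..n-1
    $count = count($questions);

    // Start every question with an empty list of similar ids.
    $similar = [];
    foreach ($questions as $question) {
        $similar[$question['question_id']] = [];
    }

    for ($i = 0; $i < $count; $i++) {
        // $j starts after $i, so a question is never compared with itself.
        for ($j = $i + 1; $j < $count; $j++) {
            similar_text(
                $questions[$i]['question_text'],
                $questions[$j]['question_text'],
                $percent
            );
            if ($percent > $threshold) {
                // Record the match for both questions of the pair.
                $similar[$questions[$i]['question_id']][] = $questions[$j]['question_id'];
                $similar[$questions[$j]['question_id']][] = $questions[$i]['question_id'];
            }
        }
    }
    return $similar;
}

// Made-up sample rows, for illustration only.
$rows = [
    ['question_id' => 10, 'question_text' => 'Plants can be made disease resistant through'],
    ['question_id' => 11, 'question_text' => 'Plants can be made disease resistant by'],
];
print_r(findSimilarPairsOnce($rows));
```

For n questions this makes n(n-1)/2 calls to similar_text() instead of n*n, which roughly halves the work on a large array.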