User Tools

  • Logged in as: anonymous (anonymous)
  • Log Out

Site Tools


mantisbt:database_optimization

This is an old revision of the document!


This page describes some database optimisations for SQL Server 2000. These are based on Mantis 1.0.2

The Problem

The project I was working on used a Mantis installation based on Windows 2003 Server/IIS/PHP and SQL Server. The filters were starting to become unusably slow, particularly the view_all_bug_page. Pressing the “Rest Filter” button could produce a delay of 40-50 seconds before a page appeared. This problem was getting worse as we added more and more issues.

The Investigation

The investigation of the problem centered on the filter code in code/filter_api.php. The function filter_get_bug_rows() is the core of the bug filtering system.

The existing filter_get_bug_rows() function is 813 lines of code - arguably a candidate for some “refactoring with extreme prejudice” however that is beyond the scope of what I was trying to do here, which was to make MANTIS work acceptably under SQL server with more than 1000 open issues.

What is interesting is the area after line 724 in the original file. What happens here is that a list of relevant issues is created in the variable $t_id_array. This is then converted into a where clause at line 756. This produces a SQL statement which will be used to query mantis_bug_table of the form: -

SELECT ... FROM mantis_bug_table JOIN ...... 
  WHERE 
  mantis_bug_table.id IN ( 1, 10, 11, ...... long list OF bug ids ... )

The key observation that I made was that SQL Server hates long in (…) lists. When your project has several thousand open issues, this in (…) clause will contain several thousand numbers. The result, in SQL Server at least, is misery.

The Solution

Given that the problem was caused by the excessively long in() clause, my solution was to convert the in(…) clause to a temporary table, so that the SQL show above becomes…

SELECT ... FROM mantis_bug_table JOIN ...... 
 #filtered_ids fid ON fid.id = mantis_bug_table.id

Where the #filtered_ids table contains the list of bug_ids that were previously in the in(…) clause.

The full text of my solution is shown below: -

Firstly replace the code after line 724 with the following : -

		# Possibly do two passes. First time, grab the IDs of issues that match the filters. Second time, grab the IDs of issues that
		# have bugnotes that match the text search if necessary.

		#
		# Using a long list of bug id's for subsequent select statements does not seem to work well in SQL server
		# so.... we will select the list of bug id's into a temp table
		#
		db_query("if exists (select * from tempdb..sysobjects where id = object_id(N'tempdb..#filtered_ids') )".
		         "drop table [dbo].[#filtered_ids]" );
		db_query( "create table #filtered_ids ( fid int not null)" );
		db_query( "create clustered index ix_filtered_ids on #filtered_ids(fid)" );
 
		$t_id_array = array();
 
		$t_id_where = $t_where;
		$t_id_join = $t_join;
		if ( !is_blank( $t_id_where ) && !is_blank( $t_textsearch_where_clause ) ) {
				$t_id_where = $t_id_where . ' AND ' . $t_textsearch_where_clause;
		}
		$query  = "insert into #filtered_ids SELECT DISTINCT $t_bug_table.id as id
						$t_from
						$t_id_join
						$t_id_where";
		db_query( $query );
 
		$t_id_where = $t_where;
		$t_id_join = $t_join;
		if ( !is_blank( $t_textsearch_wherejoin_clause ) ) {
			$t_id_where = $t_id_where . ' AND ' . $t_textsearch_wherejoin_clause;
			$t_id_join = $t_id_join . " INNER JOIN $t_bugnote_table ON $t_bugnote_table.bug_id = $t_bug_table.id";
			$t_id_join = $t_id_join . " INNER JOIN $t_bugnote_text_table ON $t_bugnote_text_table.id = $t_bugnote_table.bugnote_text_id";
			$query  = "insert into #filtered_ids SELECT DISTINCT $t_bug_table.id as id
						$t_from
						$t_id_join
						$t_id_where";
			db_query( $query );
		}
 
		$query  = "SELECT count(DISTINCT(fid)) as nrows from #filtered_ids";
		$result = db_query( $query );
		$row = db_fetch_array( $result);
		$row_count = $row['nrows'];
 
		$t_from = 'FROM ' . $t_bug_table;
 
		# Get the total number of bugs that meet the criteria.
		$bug_count = $row_count; #count( $t_id_array );

Then at line 853 change the SELECT to :-

	$t_order = " ORDER BY " . implode( ', ', $t_order_array );
	$t_select	= implode( ', ', array_unique( $t_select_clauses ) );
 
        # Modified to optimize it for SQL Server
        # Join to the temp table of ids that we have created.
        $t_join .= " JOIN #filtered_ids on #filtered_ids.fid = $t_bug_table.id ";
	$query2  = "select DISTINCT $t_select
				$t_from
				$t_join
				$t_order";
 
	# Figure out the offset into the db query
	#

How much difference does it make?

I will admit that I haven't made any exact measurements of the difference that this optimization makes with out current load of over 6400 open issues.

What I do know is that when we had about 1000 open issues, the view_all_bug_page was regularly taking over 40s to appear. (Sometimes the SQL caching would work, and it would appear straight away, other times, when a newly added bug changed the text of that in(…) clause the SQL would change, and everything would slow down again…) To add to that the time taken to display the view_all_bug_page seemed to be increasing in a non-linear fashion. The result for the project would have been catastrophic as the number of issues continued to rise.

In addition to the above SQL optimisation, there were some significant benefits to be gained from adding cacheing for custom field values that we have displayed in the view_all_bugs_page.

We now have over 6400 open issues. The view_all_bugs_page never takes more than 5 seconds to appear. Pages other than the first page such as accessed by clicking on the “Next” link can be accessed in under 3 seconds.

mantisbt/database_optimization.1162207684.txt.gz · Last modified: 2008/10/29 04:31 (external edit)

Driven by DokuWiki