Files @ r13257:4c5b8120be59
Branch filter:

Location: cpp/openttd-patchpack/source/src/core/alloc_func.hpp

rubidium
(svn r17776) -Codechange: [SDL] make "update the video card"-process asynchronious. Profiling with gprof etc. hasn't shown us that DrawSurfaceToScreen takes a significant amount of CPU; only using TIC/TOC it became apparant that it was a heavy CPU-cycle user or that it was waiting for something.
The benefit of making this function asynchronious ranges from 2%-25% (real time) during fast forward on dual core/hyperthreading-enabled CPUs; 8bpp improvements are, in my test cases, significantly smaller than 32bpp improvements.
On single core non-hyperthreading-enabled CPUs the extra locking/scheduling costs up to 1% extra realtime in fast forward. You can use -v sdl:no_threads to disable threading and undo this loss.
During normal non-fast-forwarded games the benefit/costs are negligable except when the gameloop takes more than about 90% of the time of a tick.
Note that allegro's performance does not improve with this system, likely due to their way of getting data to the video card. It is not implemented for the OS X/Windows video backends, unless (ofcourse) SDL is used there.
Funny is that the performance of the 32bpp(-anim) blitter is, at least in some test cases, significantly faster (more than 10%) than the 8bpp(-optimized) blitter when looking at real time in fast forward on a dual core CPU; it was slower.
The idea comes from a paper/report by Idar Borlaug and Knut Imar Hagen.
/* $Id$ */

/*
 * This file is part of OpenTTD.
 * OpenTTD is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, version 2.
 * OpenTTD is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
 * See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with OpenTTD. If not, see <http://www.gnu.org/licenses/>.
 */

/** @file alloc_func.hpp Functions related to the allocation of memory */

#ifndef ALLOC_FUNC_HPP
#define ALLOC_FUNC_HPP

/**
 * Functions to exit badly with an error message.
 * It has to be linked so the error messages are not
 * duplicated in each object file making the final
 * binary needlessly large.
 */
void NORETURN MallocError(size_t size);
void NORETURN ReallocError(size_t size);

/**
 * Simplified allocation function that allocates the specified number of
 * elements of the given type. It also explicitly casts it to the requested
 * type.
 * @note throws an error when there is no memory anymore.
 * @note the memory contains garbage data (i.e. possibly non-zero values).
 * @tparam T the type of the variable(s) to allocation.
 * @param num_elements the number of elements to allocate of the given type.
 * @return NULL when num_elements == 0, non-NULL otherwise.
 */
template <typename T>
static FORCEINLINE T *MallocT(size_t num_elements)
{
	/*
	 * MorphOS cannot handle 0 elements allocations, or rather that always
	 * returns NULL. So we do that for *all* allocations, thus causing it
	 * to behave the same on all OSes.
	 */
	if (num_elements == 0) return NULL;

	T *t_ptr = (T*)malloc(num_elements * sizeof(T));
	if (t_ptr == NULL) MallocError(num_elements * sizeof(T));
	return t_ptr;
}

/**
 * Simplified allocation function that allocates the specified number of
 * elements of the given type. It also explicitly casts it to the requested
 * type.
 * @note throws an error when there is no memory anymore.
 * @note the memory contains all zero values.
 * @tparam T the type of the variable(s) to allocation.
 * @param num_elements the number of elements to allocate of the given type.
 * @return NULL when num_elements == 0, non-NULL otherwise.
 */
template <typename T>
static FORCEINLINE T *CallocT(size_t num_elements)
{
	/*
	 * MorphOS cannot handle 0 elements allocations, or rather that always
	 * returns NULL. So we do that for *all* allocations, thus causing it
	 * to behave the same on all OSes.
	 */
	if (num_elements == 0) return NULL;

	T *t_ptr = (T*)calloc(num_elements, sizeof(T));
	if (t_ptr == NULL) MallocError(num_elements * sizeof(T));
	return t_ptr;
}

/**
 * Simplified reallocation function that allocates the specified number of
 * elements of the given type. It also explicitly casts it to the requested
 * type. It extends/shrinks the memory allocation given in t_ptr.
 * @note throws an error when there is no memory anymore.
 * @note the pointer to the data may change, but the data will remain valid.
 * @tparam T the type of the variable(s) to allocation.
 * @param t_ptr the previous allocation to extend/shrink.
 * @param num_elements the number of elements to allocate of the given type.
 * @return NULL when num_elements == 0, non-NULL otherwise.
 */
template <typename T>
static FORCEINLINE T *ReallocT(T *t_ptr, size_t num_elements)
{
	/*
	 * MorphOS cannot handle 0 elements allocations, or rather that always
	 * returns NULL. So we do that for *all* allocations, thus causing it
	 * to behave the same on all OSes.
	 */
	if (num_elements == 0) {
		free(t_ptr);
		return NULL;
	}

	t_ptr = (T*)realloc(t_ptr, num_elements * sizeof(T));
	if (t_ptr == NULL) ReallocError(num_elements * sizeof(T));
	return t_ptr;
}

/** alloca() has to be called in the parent function, so define AllocaM() as a macro */
#define AllocaM(T, num_elements) ((T*)alloca((num_elements) * sizeof(T)))

#endif /* ALLOC_FUNC_HPP */