NDS WiFi Programming with devkitARM — Part 3

Posted in NintendoDS, sockets on February 18th, 2010 by Chaos Engineer

Finally, after half a year... part 3.

I don't want to go any further without laying out some terminology to use. This project will consist of two parts, an application running on a PC, which I will refer to as the "Interface Host", and an application running on a NintendoDS, which I will refer to as the "Interface Client".

The host application as you may know will be rendering a cube using SDL and OpenGL. It will have a socket listener that will allow the NintendoDS to connect and negotiate terms by which it will send input data to the PC. I wanted to make the project simple but at the same time built with a framework that can be easily extended to do even more interesting things like pipe audio from the DS to the PC or something of the like. In order to do this, we'll define a simple protocol.

The protocol is quite simple. The Host machine will have a TCP listener that will allow Clients to connect and register. Once registered, a Client opens its own TCP listener that will allow the Host to send commands to it. These listeners are used for signaling, that is, sending commands back and forth; I will use the terms signal and command interchangeably. Once signal listeners are established on both ends, the applications have a channel over which they can easily communicate. They can use this channel to negotiate terms for a transfer of DS touchpad data, for example.

By having an array of agreed-upon values that correspond to specific signals, and expected data to follow those signals, the applications have the foundation of a protocol. There may be more to the protocol as well, such as an expected order of operations. Again with a focus on simplicity, let us outline a protocol:

Nds Interface Protocol

The idea here is that the machines first negotiate with one another, eventually establishing a method to transfer data from the client to the host. All negotiation occurs over TCP, but the actual data transfer will occur over UDP on an agreed-upon port. I won't go into details of TCP vs UDP. There are articles on the net written by people far more qualified than me if you are curious. The basic reasoning is that we need the negotiation to be reliable, but we can tolerate a missing data transfer. Although in theory a series of UDP packets sent over a network can arrive at the destination in any order, I have never seen this happen in practice. Perhaps because all my practice has been over a LAN... maybe things over a WAN would be different. Regardless... UDP works great in practice and provides the benefits of being simpler and carrying less overhead. UDP is generally used for realtime streaming applications where a lost transfer doesn't break the overall functionality of the system. This is exactly the situation for our input data.
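
To make the "tolerate a missing transfer" idea concrete, here is a minimal sketch of how one touchpad sample could be packed into a UDP payload. The 8-byte layout, the sequence number and all of the names (TouchSample, PackTouchSample, UnpackTouchSample, AcceptIfNewer) are assumptions made up for illustration; they are not the format the finished applications use. The sequence number lets the host simply drop a sample that arrives late or twice instead of trying to recover it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical wire layout for one touchpad sample (big-endian):
//   bytes 0-3: sequence number, bytes 4-5: x, bytes 6-7: y
const std::size_t kTouchPacketSize = 8;

struct TouchSample
{
	std::uint32_t sequence;
	std::uint16_t x;
	std::uint16_t y;
};

// Serialize a sample into exactly kTouchPacketSize bytes.
void PackTouchSample(const TouchSample &s, unsigned char *out)
{
	assert(out != NULL);
	out[0] = static_cast<unsigned char>((s.sequence >> 24) & 0xFF);
	out[1] = static_cast<unsigned char>((s.sequence >> 16) & 0xFF);
	out[2] = static_cast<unsigned char>((s.sequence >> 8) & 0xFF);
	out[3] = static_cast<unsigned char>(s.sequence & 0xFF);
	out[4] = static_cast<unsigned char>((s.x >> 8) & 0xFF);
	out[5] = static_cast<unsigned char>(s.x & 0xFF);
	out[6] = static_cast<unsigned char>((s.y >> 8) & 0xFF);
	out[7] = static_cast<unsigned char>(s.y & 0xFF);
}

// Rebuild a sample from kTouchPacketSize received bytes.
TouchSample UnpackTouchSample(const unsigned char *in)
{
	assert(in != NULL);
	TouchSample s;
	s.sequence = (static_cast<std::uint32_t>(in[0]) << 24) |
	             (static_cast<std::uint32_t>(in[1]) << 16) |
	             (static_cast<std::uint32_t>(in[2]) << 8) |
	              static_cast<std::uint32_t>(in[3]);
	s.x = static_cast<std::uint16_t>((in[4] << 8) | in[5]);
	s.y = static_cast<std::uint16_t>((in[6] << 8) | in[7]);
	return s;
}

// Returns true and updates lastSeq only if the sample is newer than the
// last one accepted. Late or duplicate datagrams are silently dropped.
// (Sequence wraparound is ignored to keep the sketch short.)
bool AcceptIfNewer(const TouchSample &s, std::uint32_t &lastSeq)
{
	if (s.sequence <= lastSeq)
		return false;
	lastSeq = s.sequence;
	return true;
}
```

The client would pass the packed buffer to sendto() and the host would call UnpackTouchSample() on whatever recvfrom() hands back. Writing the bytes out by hand keeps the format independent of the endianness and struct padding on either machine.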

The host machine opens a listener on a port the client knows to look for. The client connects to the host machine on this port, and sends the Register signal. The host responds on the same socket with the RegisterAck signal. This acknowledgement signal is followed by a 32-bit integer telling the client what port to open as its own signal listener. The host then knows how to communicate with its newly registered "node". Once the client has constructed its own signal listener, the machines can have any number of conversations initiated by either end. Each conversation will typically occur over the same socket connection, and consist of a signal followed by an acknowledgement signal with a data payload. A simple Beowulf cluster could be implemented using this design. Typically deployed over a non-homogeneous array of machines, a Beowulf host might send a signal to query each node's capability and the node might respond with the time it required to crunch some numbers using its potentially unique hardware specifications.
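
As a rough sketch of the registration exchange just described, the RegisterAck message could be built and parsed like this. The numeric signal values, the one-byte signal field, the big-endian port and the function names are all assumptions for the sake of the example; the only thing taken from the protocol itself is "RegisterAck followed by a 32-bit integer".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical signal values; both applications just need to agree on them.
enum Signal : std::uint8_t
{
	sigRegister    = 1,
	sigRegisterAck = 2
};

// Host side: build "RegisterAck + 32-bit port" as 5 bytes, port big-endian.
std::vector<unsigned char> BuildRegisterAck(std::uint32_t listenPort)
{
	std::vector<unsigned char> msg;
	msg.push_back(sigRegisterAck);
	for (int shift = 24; shift >= 0; shift -= 8)
		msg.push_back(static_cast<unsigned char>((listenPort >> shift) & 0xFF));
	return msg;
}

// Client side: validate the signal byte and recover the port.
// Returns false (leaving listenPort untouched) if the message is malformed.
bool ParseRegisterAck(const std::vector<unsigned char> &msg, std::uint32_t &listenPort)
{
	if (msg.size() != 5 || msg[0] != sigRegisterAck)
		return false;

	std::uint32_t port = 0;
	for (int i = 1; i <= 4; ++i)
		port = (port << 8) | msg[i];

	listenPort = port;
	return true;
}
```

The host would send() the bytes returned by BuildRegisterAck() on the socket the client registered on, and the client would recv() five bytes and hand them to ParseRegisterAck() before opening its listener on the returned port.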

The first applications I wrote that implemented this signaling design used multiple threads. The main thread was the signal listener, and when it got incoming connections, it would accept them and then pass the connected socket to a newly spawned thread that would handle the signal from the client. At the time, I thought this was an elite implementation, and without a doubt the best way to accomplish things. I then got myself into situations where I wanted to construct a signal listener either in a project where I didn't want to use threads or on a platform that doesn't support them. Enter the select() system call.

Multiplexing in a single thread, now THAT is elite. Using select(), you can easily monitor a listening socket for new connections while still handling signals from new sockets and even sending data every frame. The innermost loop of our Interface Client application on the DS will do exactly this. It will be managing the signal listener, handling any new signals and sending touchpad coordinates over the established data socket.

Here is what this synchronous code would look like inside an application's main loop, here being part of the Interface Host's main loop:

 
while(!g_bDone) // Pseudo main loop
{
	// Check listener for new signals. This is done in a time-sensitive manner
	//using select to return immediately if there is no data/connections present.
	tvTimeout.tv_sec = 0;
	tvTimeout.tv_usec = 0;
 
	FD_ZERO(&fdsRead);
	FD_SET(sktHost, &fdsRead);
 
	retval = select(FD_SETSIZE, &fdsRead, NULL, NULL, &tvTimeout);
 
	if (retval == -1)
	{
		cerr << __FILE__ << '@' << __LINE__ << " : select() error waiting on accept." << endl;
	}
	else if (retval)
	{
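		cout << "Incoming connection..." << endl;
 
		// accept() expects the size of the address buffer on input.
		iClientSize = sizeof(adrInterfaceClient);
		if((sktClient = accept(sktHost, (struct sockaddr*)&adrInterfaceClient, (socklen_t*)&iClientSize)) < 0)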
		{
			cerr << __FILE__ << '@' << __LINE__ << " : Failed to accept client connection." << endl;
		}
		else
		{
			strcpy(g_szConnectedClientName, inet_ntoa(adrInterfaceClient.sin_addr));
			cout << "Client connected: " << g_szConnectedClientName << endl;
 
			InterfaceClientSignalHandler(sktClient);
		}
	}
 
	// Do whatever else here... render objects, process input, etc.
}
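
The readiness check in the loop above can also be pulled out into a small helper, which makes it easy to poll the listener and any connected sockets with the same call. This is a sketch for a POSIX system, not code from the project; IsReadable is a name I made up, and it passes fd + 1 as the first argument to select() rather than FD_SETSIZE, which is the conventional value.

```cpp
#include <cassert>
#include <cstddef>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

// Returns true if fd has something to read right now. For a listening
// socket that means a pending connection. Never blocks.
bool IsReadable(int fd)
{
	// fd_set can only describe descriptors below FD_SETSIZE.
	assert(fd >= 0 && fd < FD_SETSIZE);

	fd_set fdsRead;
	FD_ZERO(&fdsRead);
	FD_SET(fd, &fdsRead);

	// A zero timeout makes select() return immediately.
	struct timeval tvTimeout;
	tvTimeout.tv_sec = 0;
	tvTimeout.tv_usec = 0;

	int retval = select(fd + 1, &fdsRead, NULL, NULL, &tvTimeout);

	// -1 is an error, 0 means nothing is ready.
	return retval > 0 && FD_ISSET(fd, &fdsRead);
}
```

In the main loop this would read as if(IsReadable(sktHost)) followed by the accept() call, and the same helper works for a connected signal socket before calling recv().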
 

The next article -- part 4 -- will detail the NintendoDS implementation using the dswifi library, and I'll probably write a 5th article just to wrap it all up.


DERAILED — Platform independent dynamic linking

Posted in Uncategorized on August 9th, 2009 by Chaos Engineer

Oops, I definitely got sidetracked mucking around some more with my platform independent rendering engine. Instead of holding out on the NDS wifi article in silence any longer, I thought I would chime in to let you know I haven't forgotten about you and show you what I've been up to.

My platform independent rendering engine is based on a pure virtual renderer base class that is filled with an instance of a derived class by a dynamically linked code module. Along with the pure virtual renderer class, there are a few pure virtual resource classes as well, the actual instantiation of which is performed by the renderer class. Trying to maintain platform independence, the singleton (factory / manager) class that returns this renderer instance has some interesting code I thought I would share.

It is possible to have a project directory (even though a bit messy) that will compile in both Microsoft Visual Studio and in GCC without having to change a lick of code. That's child's play for a simple project, but it's still possible even for a project that contains a main executable and several other dynamic libraries.

So first off, since we are dynamically linking, we are definitely going to have function pointers returned by dlsym(...) or GetProcAddress(...). The actual DLL/SO modules export only three functions--one to create the instance, one to destroy it, and one to return the version. So we have the typedefs for these as so:

 
typedef int (*fpCreateRendererInterface)(CRenderer **pInterface);
typedef int (*fpDestroyRendererInterface)(CRenderer **pInterface);
typedef int (*fpQueryRendererVersion)(void);
 

Second off, this is platform independent, so obviously we will need to be using some fancy pre-processor conditionals (PPCs) to detect the platform / compiler. Now in certain instances we only want to know what the compiler is, because the actual platform doesn't matter, but in other cases we want to know the actual platform as well because the same compiler is used on multiple platforms (GCC is used in Linux and also Mac). First for the compiler, we can detect Visual C++ by checking to see if _MSC_VER is defined, and we can detect GCC by checking to see if __GNUC__ is defined. For the platform, we can detect Windows by checking if _WIN32 is defined (yes, even 64-bit Windows targets define this; the un-prefixed WIN32 comes from the Windows headers and project settings rather than from the compiler), we can detect Linux by checking if __linux__ is defined, and we can detect OSX by checking if __MACH__ and __APPLE__ are defined.
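
Collected in one place, those checks might look like the following sketch. The ENGINE_* macro names and PlatformDescription() are made up for this example and are not part of the engine. It tests _WIN32, which is the macro the compiler itself predefines for Windows targets, and it labels the __GNUC__ branch "gcc-compatible" because Clang defines __GNUC__ as well.

```cpp
#include <cassert>
#include <cstring>

// Compiler detection (hypothetical ENGINE_* names).
#if defined(_MSC_VER)
	#define ENGINE_COMPILER_NAME "msvc"
#elif defined(__GNUC__)
	#define ENGINE_COMPILER_NAME "gcc-compatible"
#else
	#define ENGINE_COMPILER_NAME "unknown"
#endif

// Platform detection, independent of which compiler is in use.
#if defined(_WIN32)
	#define ENGINE_PLATFORM_NAME "windows"
#elif defined(__APPLE__) && defined(__MACH__)
	#define ENGINE_PLATFORM_NAME "osx"
#elif defined(__linux__)
	#define ENGINE_PLATFORM_NAME "linux"
#else
	#define ENGINE_PLATFORM_NAME "unknown"
#endif

// Adjacent string literals are concatenated at compile time,
// so this returns something like "linux/gcc-compatible".
const char *PlatformDescription()
{
	return ENGINE_PLATFORM_NAME "/" ENGINE_COMPILER_NAME;
}
```

Keeping the compiler and platform checks separate matters for combinations like GCC on Windows, where the compiler is GCC but the available system calls are the Windows ones.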

We can then use these PPCs to create code that is savagely flexible. In this singleton class, I have a static void pointer to the library (Windows' HMODULE is just an opaque pointer under the hood, don't let them trick you) called m_hLibrary. Let's see how we could dynamically link either a .so file in Linux or a .dll file in Windows and place the returned handle in the same variable... here is the code for the "SGL" (SDL GL) renderer type which can be used in either Linux or Windows, taking into account if the code is in release or debug mode:

 
case rstSGL:
	Log::Entry(-1,__FILE__,__LINE__,"Creating SGL rendering interface...");
#if defined(__GNUC__)
 
#if defined(NDEBUG)
	Log::Entry(-1,__FILE__,__LINE__,"__GNUC__ defined... opening release library");
	m_hLibrary = dlopen("./libRendererSGL.so",RTLD_NOW);
#else
	Log::Entry(-1,__FILE__,__LINE__,"__GNUC__ defined... opening debug library");
	m_hLibrary = dlopen("../RendererSGL/bin/Debug/libRendererSGL.so",RTLD_NOW);
#endif
	if(!m_hLibrary)
		Log::Entry(2,__FILE__,__LINE__,"Error loading shared library. dlerror() = %s",dlerror());
#elif defined(_MSC_VER)
#if defined(NDEBUG)
	// Second argument is reserved and must be NULL; third is a flags DWORD.
	m_hLibrary = LoadLibraryExA("RendererSGL.dll",NULL,0);
#else
	m_hLibrary = LoadLibraryExA("../RendererSGL/bin/Debug/RendererSGL.dll",NULL,0);
#endif
#endif
	break;
 

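Please note that GCC does not define NDEBUG for you, even when using -O2, so you need to set your release target to define it manually (-D NDEBUG). This surprised me at first, but the standard only says that assert() is disabled when NDEBUG is defined; it never asks the compiler to define it. Visual Studio defines it in the Release configuration of its project templates, which is why it seems automatic there.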

Also note that when linking using dlopen(...), we specify RTLD_NOW to force resolution of all symbols in the dynamic library at load time. If we used RTLD_LAZY, we could exclude a lot of code from our library, and let it resolve symbols later (i.e. from the main executable). Unfortunately this works great on Linux but seemingly has no analog in Windows, so in order to keep the projects symmetrical across platforms, we use RTLD_NOW.

So let's look at what's going on here... it's a bit overcomplicated, but I thought I would provide some extra good ideas for you guys. If using GCC, we use dlopen(...) to get the handle to the .so file, and if using VC++, we use LoadLibraryExA(...) to get the handle to the .dll file. If NDEBUG is defined, we link to the release version of the dynamic library (actually look in the current directory or system path as if it were a proper deployment), otherwise we link to the debug version of the library to simplify development.

Oh snap! So we now have linked to a dynamic library regardless of what platform we are running on! This is kickass, so where to now? We need to get pointers to the functions in the library we are going to use. The function that does this in Linux is dlsym(...), and in Windows it is GetProcAddress(...). Here is what this would look like in a platform independent form:

 
#if defined(__GNUC__)
	fpCreateRendererInterface IntCreate=(fpCreateRendererInterface)dlsym(m_hLibrary,"CreateInterface");
#elif defined(_MSC_VER)
	fpCreateRendererInterface IntCreate=(fpCreateRendererInterface)GetProcAddress((HMODULE)m_hLibrary,"CreateInterface");
#endif
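
If the conditionals start to pile up, the three operations (open, look up, close) can be hidden behind a tiny wrapper so the rest of the code never sees them. This is a sketch and not the engine's code; OpenLibrary, FindSymbol and CloseLibrary are hypothetical names. It switches on _WIN32 instead of the compiler macros, on the assumption that the choice of API depends on the platform (GCC can target Windows too).

```cpp
#include <cassert>
#include <cstddef>

#if defined(_WIN32)
	#include <windows.h>
#else
	#include <dlfcn.h>
#endif

// Load a dynamic library and return an opaque handle (NULL on failure).
void *OpenLibrary(const char *szPath)
{
#if defined(_WIN32)
	return reinterpret_cast<void*>(LoadLibraryA(szPath));
#else
	return dlopen(szPath, RTLD_NOW);
#endif
}

// Look up an exported symbol by name (NULL if it is not found).
void *FindSymbol(void *hLibrary, const char *szName)
{
	assert(hLibrary != NULL);
#if defined(_WIN32)
	return reinterpret_cast<void*>(
		GetProcAddress(static_cast<HMODULE>(hLibrary), szName));
#else
	return dlsym(hLibrary, szName);
#endif
}

// Release the handle returned by OpenLibrary().
void CloseLibrary(void *hLibrary)
{
	if (hLibrary == NULL)
		return;
#if defined(_WIN32)
	FreeLibrary(static_cast<HMODULE>(hLibrary));
#else
	dlclose(hLibrary);
#endif
}
```

With this in place the lookup above collapses to a single line on every platform: cast the result of FindSymbol(m_hLibrary, "CreateInterface") to fpCreateRendererInterface. On older Linux toolchains you may still need to link with -ldl.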
 

Damn, it's that easy? Indeed it is. We now have a function pointer to a procedure in a dynamically linked library regardless of whether it came from a .so file in Linux or a .dll file in Windows. Obviously once armed with the function pointer, the remaining code is the same regardless of platform. We just use IntCreate(...) as if it were a normal function. In this case we pass a pointer to a pointer to the pure virtual renderer base class, and inside the dynamic library, we assign the pointer to an instance of a derived class:

 
extern "C"
{
	int CreateInterface(CRenderer **pInterface)
	{
		if(*pInterface)
			return -1;
 
		*pInterface = new CRendererSGL();
 
		return 0;
	}
...
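
For anyone who wants to try the pattern without the whole engine, here is a stripped-down, self-contained sketch of both sides of that boundary. The CRenderer here is a simplified stand-in with a single made-up method, CRendererNull is a hypothetical do-nothing implementation, and DestroyInterface is my guess at what the matching destroy export could look like; none of it is the engine's real code.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for the pure virtual renderer base class.
class CRenderer
{
public:
	virtual ~CRenderer() {}
	virtual const char *GetName() const = 0;
};

// Hypothetical derived class that would live inside the dynamic library.
class CRendererNull : public CRenderer
{
public:
	virtual const char *GetName() const { return "null"; }
};

// extern "C" keeps the exported names unmangled so dlsym/GetProcAddress
// can find them by their plain names.
extern "C"
{
	int CreateInterface(CRenderer **pInterface)
	{
		// Refuse to overwrite an interface that already exists.
		if (pInterface == NULL || *pInterface != NULL)
			return -1;

		*pInterface = new CRendererNull();
		return 0;
	}

	int DestroyInterface(CRenderer **pInterface)
	{
		if (pInterface == NULL || *pInterface == NULL)
			return -1;

		// Delete inside the library, which is where the object was allocated.
		delete *pInterface;
		*pInterface = NULL;
		return 0;
	}
}
```

The caller starts with CRenderer *p = NULL, passes &p to CreateInterface, and from then on only talks to the abstract CRenderer. Pairing the create export with a destroy export keeps the new and the delete in the same module, which matters on Windows when the executable and the DLL do not share a runtime heap.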
 

This is really quite powerful, and since the only actual call to an exported function is when creating the derived class instance, the performance hit while using the derived class is only from the vftable. More importantly this lets us do something ultra simple in our main code to get a reference to an abstract renderer interface that doesn't require knowledge of the nitty-gritty of either D3D or OpenGL:

 
	Log::Entry(0,__FILE__,__LINE__,"Creating rendering device...");
	Graphics::CreateInterface();
	CRenderer *renderer = Graphics::GetInterface();
	renderer->Initialize(0);
 
	if(g_bSafeDevice)
		renderer->CreateDeviceSafe();
	else
		renderer->CreateDevice(800, 600, 32, g_bWindowed);
 

Yeah, I know. That code is delicious. It's even more tasty when you have a CMesh that contains instances of pure virtual CResourceVtxBuff and CResourceIdxBuff created by the derived CRenderer class in the dynamic library, and a CModel that contains a number of CMesh instances as well as a number of CMaterial instances that contain instances of pure virtual CResourceTexture also created by the derived CRenderer class in the dynamic library. So to create a model and render it, you would have to do something like the following:

 
	// example of loading a mesh
	mshOut = new CMesh();
	iStride=mshOut->GetVtxStride();
	mshOut->SetVtxCount(iVertexCount);
	mshOut->SetFaceCount(iFaceCount);
 
	m_pRenderer->CreateMeshBuffers(iStride*iVertexCount,sizeof(int)*3*iFaceCount,mshOut);
 
	// if pRenderer is D3D, this lock would be like IDirect3DVertexBuffer9->Lock(...),
	//while in OpenGL it would be like glMapBufferARB(...). Transparent at this point.
	VtxData=(unsigned char*)mshOut->GetVtxPtr()->LockWrite(0,0);
 
	// Write vertex buffer to VtxData here...
 
	mshOut->GetVtxPtr()->Unlock();
 
	IdxData=(unsigned char*)mshOut->GetIdxPtr()->LockWrite(0,0);
 
	// write index buffer to IdxData here..
 
	mshOut->GetIdxPtr()->Unlock();
 
	// example of loading a material
	mtlOut=new CMaterial();
	mtlOut->SetDiffuse(FloatToLongColor(v3fDiffuse.x, v3fDiffuse.y, v3fDiffuse.z));
	mtlOut->SetAmbient(FloatToLongColor(v3fAmbient.x, v3fAmbient.y, v3fAmbient.z));
	mtlOut->SetSpecular(FloatToLongColor(v3fSpecular.x, v3fSpecular.y, v3fSpecular.z));
	mtlOut->SetMaterialName(sMaterialName);
 
	mtlOut->SetTexture(m_pRenderer->CreateTexture(&imgTexture));
 
	// and you add them to a model which contains std::vectors of meshes and materials...
 
	iMaterialIdxList[j] = mdlOut->AddMaterial(mtlOut);
 
	//...
 
	mdlOut->AddMesh(mshOut,iMaterialIdxList[iMaterialRef]);
 
	// and eventually render!
 
	iMeshCount = mdlCurr->GetMeshCount();
 
	for(j=0;j<iMeshCount;j++)
	{
		pRenderer->SetWorld(&matWorld);
 
		k = mdlCurr->GetMeshMaterialIdx(j);
 
		pRenderer->SetTexture(mdlCurr->GetMaterial(k)->GetTexture());
		pRenderer->Render(mdlCurr->GetMesh(j));
	}
 

Anyway, I guess this isn't all that useful, and maybe I'm just showing off at this point. It does exhibit some good ideas, though, and shows how glorious abstraction can be. Regardless, I got distracted again. The NDS Wifi example continues. I will post in the next few days to describe the protocol used and give a little primer on sockets. Keep your eyes peeled.


Newline neutrality with ifstream

Posted in C++, rendering on June 25th, 2009 by Chaos Engineer

I recently got distracted and started porting my old game engine to use a new platform independent rendering interface I devised. I'm pretty psyched about the rendering system: it fully encapsulates a graphics SDK and all its associated resources (vertex and index buffers, textures, etc). Deriving from the pure virtual renderer and resource base classes, I can write a new renderer class that targets a specific platform/SDK, plug it in, and just have it work without modifying a single line of code.

Part of this old game engine was a resource management class that could load 3D Studio MAX ASCII exported scenes/objects. These are .ASE files, a versatile and simple format that has a surprisingly wide usage by various game engines. ASE files can be used when building Doom3 and Quake3 / Quake4 levels, since the format boasts native support in id Software and Loki Software's GtkRadiant design tool, so it is a generally accepted format for storing game content. The ASE format specification itself is quite versatile, and can contain material specifications, vertices, faces (vertex indexes), vertex colors, texture coordinates, and even animations with full interpolation parameters just to name a few. Being ASCII, it is rather easy to parse and modify.

In the ASE format, each line of the file serves a single purpose. It is either an opening or closing tag for a field, or an element in a field. It made sense then when I first wrote the import code to parse the file on a line-by-line basis using ifstream::getline(...). This worked great in Windows, and I had no issues with my implementation. Flash forward to now when I am rewriting this code to work in Linux, but using files created in Windows. I never thought the disparities of a 'newline' between platforms would ever cause such a problem.

The ifstream::getline function was specifically designed to get a single line from a file without including the newline or delimiting character(s). The newline character ('\n') is the delimiting character by default, as it serves its purpose most of the time. ifstream::getline(...) reads in all characters up to the delimiter, then extracts and discards the delimiter. The problem arises when considering the definition of a newline on different platforms.

In Windows, the newline for iostream operations is a carriage-return, line-feed combo. It is actually two characters with ASCII decimal values 13, 10 (0x0D, 0x0A in hex), often referred to as a CRLF combo. In Linux, a newline is simply a line-feed (dec 10, hex 0x0A). So when performing ifstream::getline(...) in Linux, what you are doing is reading a line up to the delimiter (newline by default), and if your file was created in Windows only the last half of the CRLF combo is discarded, while the first half is read in as the last character in your string variable. This makes any literal comparisons made against this string variable fail in Linux while working fine in Windows.

I thought for sure there was some elegant way to overcome this and attain platform independent file parsing without forcing a complete rewrite of the import code. This turned out to not be the case. The ifstream::getline function is handy, but by doing you the favor of automatically extracting and discarding the delimiter, it somewhat cripples its versatility. My impetus was twofold. Obviously first I wanted to be able to read these ASE files with 0x0D,0x0A line delimiters into Linux without having the 0x0D on the end of my read lines. This would be simple enough, but I also wanted code that would work regardless of platform combinations. This meant reading Windows files in Linux, Linux files in Windows, Windows files in Windows and Linux files in Linux.

So this added a bit of complexity to the requirements. First off, I ruled out ifstream::getline(...) because it would extract and discard different characters from the file stream depending on platform. When reading a Linux file in Windows, my expectation (which I have not tested) was that it would keep reading from the stream, looking for a CRLF combo, and would likely read the entire file without finding one. When reading a Windows file in Linux, as I found out, it would read the line including 0x0D, then stop at and discard the 0x0A. If I didn't consider reading Linux files in Windows, the following implementation was elegant, and worked in all other situations:

 
char szLineBuff[256];
ifstream fsIn("test.ase", ios::in);
 
// Note that failbit would be set if this
//failed to extract any characters before
//the delimiter (an empty line, for example),
//and the loop would end.
while(fsIn.get(szLineBuff, sizeof(szLineBuff), '\n').good())
{
	// Consumes the line ending: CRLF (2 bytes
	//on disk) in Windows, just LF in Linux.
	fsIn.ignore(2,'\x0A');
	// Cleans Windows line in Linux
	size_t iLen = strlen(szLineBuff);
	if(iLen > 0 && szLineBuff[iLen-1] == '\x0D')
		szLineBuff[iLen-1] = '\0';
 
	// Process szLineBuff...
 
}
 

Now I recognize that opening the file in binary mode (ios::binary) would make the '\n' be interpreted as simply '\x0A' on both Windows and Linux. My problem with this is that the file isn't binary. I'm stubborn and think this ASCII text file should be read in text mode, and that there should be a simple way to interpret newlines from any platform on any other platform. There must be a simple and elegant solution out there somewhere...
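
One candidate that stays in text mode is to let std::getline split on '\n' and then strip a trailing carriage return by hand. The sketch below is that idea and nothing more; GetLineNeutral is a made-up name, and I have only reasoned it through rather than run it against real ASE files on every platform combination. As far as I know, Windows text mode translates CRLF to '\n' on input and passes a lone LF through unchanged, so the same function should also cope with Linux files on Windows.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Reads one line into 'line', dropping the '\n' delimiter and a trailing
// '\r' if one is present. Returns false once there is nothing left to read.
bool GetLineNeutral(std::istream &in, std::string &line)
{
	if (!std::getline(in, line))
		return false;

	// A Windows file read on Linux leaves the 0x0D at the end of the line.
	if (!line.empty() && line[line.size() - 1] == '\r')
		line.erase(line.size() - 1);

	return true;
}
```

Usage is the familiar loop: while(GetLineNeutral(fsIn, sLine)) { /* process sLine */ }. Because std::getline reads into a std::string there is no fixed buffer size to outgrow, and unlike the get()/ignore() loop it does not stop at an empty line or skip a final line that lacks a newline.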

I'll be honest and say I stopped there, as my code worked for 3/4 of the scenarios I laid out. I have yet to find a Linux native source of ASE files, so the last scenario (Linux files on Windows) hasn't become important for me yet. If anyone has a better solution for this, or knows one that would work in all scenarios while still being somewhat elegant, I'd love to hear about it.

Also, I know I promised a follow-up on the NDS WiFi programming. I only got a little distracted; the example applications are nearly finished. You'll definitely like it when it comes.
