Replaced Network Library


phit666


  • Group:  Members
  • Topic Count:  4
  • Topics Per Day:  0.00
  • Content Count:  11
  • Reputation:   2
  • Joined:  09/01/15
  • Last Seen:  

I've replaced the standard SELECT socket handling with the Libevent (libevent.org) library, for the reasons below.

1. Libevent will use EPOLL on Linux, which is a lot faster and more efficient than the SELECT mechanism. In socket.c, the SELECT-based do_sockets function runs a lot of loops (sending, receiving, parsing, and sending again across all active sockets) every time SELECT is triggered. Note how many "for(i = 1; i < fd_max; i++)" loops there are, and all of them run even when the server has received only a few bytes. With the libevent mechanism, I only need to process one socket: the one that is actually receiving data.

2. The standard 1024-socket limit of SELECT on Linux (FD_SETSIZE) is enough for a lightly populated server, but what about heavily populated ones? The EPOLL socket limit is far beyond 1024; it can exceed 100k sockets, depending on the kernel and the configured descriptor limits.
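To make the two limits concrete, here is a minimal sketch for a POSIX system. The helper names (fits_in_fd_set, open_fd_soft_limit, print_fd_limits) are made up for illustration and are not part of rAthena or libevent. It only reports the compile-time FD_SETSIZE cap that applies to select and the per-process descriptor limit that bounds an epoll-based server.

```c
#include <limits.h>
#include <stdio.h>
#include <sys/resource.h> /* getrlimit, RLIMIT_NOFILE */
#include <sys/select.h>   /* FD_SETSIZE */

/* Hypothetical helper: returns 1 if fd can be stored in an fd_set, 0 otherwise.
 * select() can only watch descriptors whose value is below FD_SETSIZE. */
int fits_in_fd_set(int fd)
{
	return fd >= 0 && fd < FD_SETSIZE;
}

/* Hypothetical helper: soft limit on open descriptors for this process,
 * or -1 on error. This limit, not FD_SETSIZE, bounds an epoll-based server. */
long open_fd_soft_limit(void)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
		return -1;
	if (rl.rlim_cur == RLIM_INFINITY || rl.rlim_cur > (rlim_t)LONG_MAX)
		return LONG_MAX;
	return (long)rl.rlim_cur;
}

/* Prints both limits so they can be compared on the target machine. */
void print_fd_limits(void)
{
	printf("FD_SETSIZE (select cap): %d\n", FD_SETSIZE);
	printf("RLIMIT_NOFILE soft limit: %ld\n", open_fd_soft_limit());
}
```

Calling print_fd_limits() on the machine that hosts the server shows how much headroom there is beyond the select cap before any kernel tuning.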

Using select mechanism (note how many for loops are here)

int do_sockets(int next)
{
	fd_set rfd;
	struct timeval timeout;
	int ret,i;

	// PRESEND Timers are executed before do_sendrecv and can send packets and/or set sessions to eof.
	// Send remaining data and process client-side disconnects here.
#ifdef SEND_SHORTLIST
	send_shortlist_do_sends();
#else
	for (i = 1; i < fd_max; i++)
	{
		if(!session[i])
			continue;

		if(session[i]->wdata_size)
			session[i]->func_send(i);
	}
#endif

	// can timeout until the next tick
	timeout.tv_sec  = next/1000;
	timeout.tv_usec = next%1000*1000;

	memcpy(&rfd, &readfds, sizeof(rfd));
	ret = sSelect(fd_max, &rfd, NULL, NULL, &timeout);

	if( ret == SOCKET_ERROR )
	{
		if( sErrno != S_EINTR )
		{
			ShowFatalError("do_sockets: select() failed, %s!\n", error_msg());
			exit(EXIT_FAILURE);
		}
		return 0; // interrupted by a signal, just loop and try again
	}

	last_tick = time(NULL);

#if defined(WIN32)
	// on windows, enumerating all members of the fd_set is way faster if we access the internals
	for( i = 0; i < (int)rfd.fd_count; ++i )
	{
		int fd = sock2fd(rfd.fd_array[i]);
		if( session[fd] )
			session[fd]->func_recv(fd);
	}
#else
	// otherwise assume that the fd_set is a bit-array and enumerate it in a standard way
	for( i = 1; ret && i < fd_max; ++i )
	{
		if(sFD_ISSET(i,&rfd) && session[i])
		{
			session[i]->func_recv(i);
			--ret;
		}
	}
#endif

	// POSTSEND Send remaining data and handle eof sessions.
#ifdef SEND_SHORTLIST
	send_shortlist_do_sends();
#else
	for (i = 1; i < fd_max; i++)
	{
		if(!session[i])
			continue;

		if(session[i]->wdata_size)
			session[i]->func_send(i);

		if(session[i]->flag.eof) //func_send can't free a session, this is safe.
		{	//Finally, even if there is no data to parse, connections signalled eof should be closed, so we call parse_func [Skotlex]
			session[i]->func_parse(i); //This should close the session immediately.
		}
	}
#endif

	// parse input data on each socket
	for(i = 1; i < fd_max; i++)
	{
		if(!session[i])
			continue;

		if (session[i]->rdata_tick && DIFF_TICK(last_tick, session[i]->rdata_tick) > stall_time) {
			if( session[i]->flag.server ) {/* server is special */
				if( session[i]->flag.ping != 2 )/* only update if necessary otherwise it'd resend the ping unnecessarily */
					session[i]->flag.ping = 1;
			} else {
				ShowInfo("Session #%d timed out\n", i);
				set_eof(i);
			}
		}

		session[i]->func_parse(i);

		if(!session[i])
			continue;

		// after parse, check client's RFIFO size to know if there is an invalid packet (too big and not parsed)
		if (session[i]->rdata_size == RFIFO_SIZE && session[i]->max_rdata == RFIFO_SIZE) {
			set_eof(i);
			continue;
		}
		RFIFOFLUSH(i);
	}

#ifdef SHOW_SERVER_STATS
	if (last_tick != socket_data_last_tick)
	{
		char buf[1024];
		
		sprintf(buf, "In: %.03f kB/s (%.03f kB/s, Q: %.03f kB) | Out: %.03f kB/s (%.03f kB/s, Q: %.03f kB) | RAM: %.03f MB", socket_data_i/1024., socket_data_ci/1024., socket_data_qi/1024., socket_data_o/1024., socket_data_co/1024., socket_data_qo/1024., malloc_usage()/1024.);
#ifdef _WIN32
		SetConsoleTitle(buf);
#else
		ShowMessage("\033[s\033[1;1H\033[2K%s\033[u", buf);
#endif
		socket_data_last_tick = last_tick;
		socket_data_i = socket_data_ci = 0;
		socket_data_o = socket_data_co = 0;
	}
#endif

	return 0;
}

Now using libevent mechanism

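void conn_readcb(struct bufferevent *bev, void *ptr)
{
	// The fd travels in the callback's user-data pointer; convert it back explicitly (intptr_t is from <stdint.h>).
	int fd = (int)(intptr_t)ptr;
	size_t len;

	if( !session_isValid(fd) )
		return;

	// Append whatever libevent has buffered to this session's receive FIFO.
	len = bufferevent_read(bev, (char *)session[fd]->rdata + session[fd]->rdata_size, RFIFOSPACE(fd));

	if( len == 0 )
		set_eof(fd);

	session[fd]->rdata_size += len;
	session[fd]->rdata_tick = last_tick;

	if( session[fd]->kill_tick > 0 )
		return;

	session[fd]->func_parse(fd);

	// func_parse may close and free the session, so re-check before touching it again.
	if( !session[fd] )
		return;

	session[fd]->func_send(fd);

	if( session[fd]->flag.eof )
	{
		session[fd]->func_parse(fd); // This should close the session immediately.
		if( !session[fd] )
			return;
	}

	RFIFOFLUSH(fd);
}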
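
The callback above gets the session's fd through its void * argument. One portable way to round-trip an int through that pointer is to go via intptr_t. This is a minimal sketch with made-up helper names (fd_to_ctx, ctx_to_fd); they are not part of rAthena or libevent.

```c
#include <stdint.h>

/* Hypothetical helper: pack a socket fd into a callback context pointer.
 * Going through intptr_t avoids the int/pointer size-mismatch warning. */
void *fd_to_ctx(int fd)
{
	return (void *)(intptr_t)fd;
}

/* Hypothetical helper: recover the fd inside the callback. */
int ctx_to_fd(void *ctx)
{
	return (int)(intptr_t)ctx;
}
```

The value from fd_to_ctx(fd) would be handed over as the user-data argument when the callback is registered, and ctx_to_fd(ptr) would be the first line of the callback.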

 

For server owners with a large player base, preferably more than 1024 players on a Linux server: PM me if you would like to test my update.

 

Edited by phit666

  • 2 weeks later...

  • Group:  Members
  • Topic Count:  54
  • Topics Per Day:  0.01
  • Content Count:  513
  • Reputation:   83
  • Joined:  08/11/12
  • Last Seen:  

Can you create a merge request for this and post the link here? Thanks.

Also, I think I recall @Secrets planning to replace RFIFOs with packet structures in C++. I'm not sure whether she's also considering refactoring this part.

Edited by Ninja


  • Group:  Members
  • Topic Count:  4
  • Topics Per Day:  0.00
  • Content Count:  11
  • Reputation:   2
  • Joined:  09/01/15
  • Last Seen:  

I will soon. I've also set libevent to use IOCP as the backend when compiling for Windows. We know how slow select is on Windows, so using IOCP as the backend will really give Windows users an edge.



  • Group:  Members
  • Topic Count:  54
  • Topics Per Day:  0.01
  • Content Count:  513
  • Reputation:   83
  • Joined:  08/11/12
  • Last Seen:  

On 8/5/2017 at 8:23 AM, phit666 said:

I will soon. I've also set libevent to use IOCP as the backend when compiling for Windows. We know how slow select is on Windows, so using IOCP as the backend will really give Windows users an edge.

That's great. But if you can also optimize the Linux part, that would be nice too. Most users prefer Linux, as it costs way less than a Windows VPS. :)



  • Group:  Members
  • Topic Count:  5
  • Topics Per Day:  0.00
  • Content Count:  149
  • Reputation:   33
  • Joined:  12/24/11
  • Last Seen:  

While I do endorse this sort of optimization, I think we should discuss it. I'm not that deep into Linux kernel programming, but let me give my 2 cents.

(Please correct me if I misunderstand anything)

AFAIK the biggest cost in SELECT comes from checking sockets that have had no activity. This is where epoll excels, since it ties the wait list directly to the sockets and only reports sockets with activity. This improves performance for most server applications.
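
To illustrate that point, here is a small Linux-only sketch. It uses pipes as stand-ins for client sockets, and the function name epoll_ready_index is made up for this example. Eight descriptors are registered, one of them receives a byte, and epoll_wait hands back just that one instead of making the caller scan all eight.

```c
#include <stdint.h>
#include <sys/epoll.h>
#include <unistd.h>

#define NPIPES 8

/* Hypothetical demo: registers NPIPES pipe read ends with epoll, writes one
 * byte into pipe `active`, and returns the index epoll reports as readable.
 * Returns -1 on error and -2 if epoll did not report exactly one descriptor. */
int epoll_ready_index(int active)
{
	int pipes[NPIPES][2];
	struct epoll_event ev, out[NPIPES];
	int ep, i, n, created = 0, result = -1;

	if (active < 0 || active >= NPIPES)
		return -1;

	ep = epoll_create1(0);
	if (ep < 0)
		return -1;

	for (i = 0; i < NPIPES; i++) {
		if (pipe(pipes[i]) != 0)
			goto cleanup;
		created++;
		ev.events = EPOLLIN;
		ev.data.u32 = (uint32_t)i; /* remember which pipe this is */
		if (epoll_ctl(ep, EPOLL_CTL_ADD, pipes[i][0], &ev) != 0)
			goto cleanup;
	}

	/* Only one "client" sends data. */
	if (write(pipes[active][1], "x", 1) != 1)
		goto cleanup;

	/* The kernel returns only descriptors that are ready; idle ones are not visited. */
	n = epoll_wait(ep, out, NPIPES, 1000);
	if (n == 1)
		result = (int)out[0].data.u32;
	else if (n >= 0)
		result = -2;

cleanup:
	for (i = 0; i < created; i++) {
		close(pipes[i][0]);
		close(pipes[i][1]);
	}
	close(ep);
	return result;
}
```

With select, the equivalent code would have to test every descriptor up to fd_max with FD_ISSET after each wakeup. How much that matters on a busy game server is exactly the ratio question in this thread.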

As rAthena is a game server, I'm not sure how big the benefit would be. Even idle clients do send/receive ping-pongs every 10 seconds. The question is the ratio: how many clients are idle on a typical server, and how many are active players?

 

My take is the following: I guess epoll will still perform very well, since we are talking about millisecond time intervals, and I'm assuming even busy players won't peak at more than around 5 packets per second. So please go ahead and change to epoll, and please share your code, but let's keep this discussion open if we can.



  • Group:  Members
  • Topic Count:  42
  • Topics Per Day:  0.01
  • Content Count:  1096
  • Reputation:   344
  • Joined:  02/26/12
  • Last Seen:  

Let me add my few cents:

  1. The 1024 limit is very easy to change via a simple kernel setting applied with sysctl -p, so there is no problem here at all.
  2. Touching mechanisms that have worked well for years without any issues is, to me, a very bad idea.
  3. I thought epoll only works on Linux, doesn't it? What about Windows?
  4. The average is not 5 pps per user; it's much higher and depends on the area and the objects around a player. During WoE, pps can easily burst up to 10k-40k/s.
  5. Are there any benchmarks of the benefits gained with rAthena? Is it possible to see "before/after" graphs? Not theory; real facts are required.
  6. If you are trying to reduce CPU usage, this is a very bad idea too, because socket handling is not very CPU-heavy. CPU time is usually eaten by the many object-related functions: scripts, skill parsing, various checks, follow-up conditions and so on. So I don't see any logical sense in switching to another mechanism without any advantages. Just for what? Because you can? No, I'm against that, and I hope other people who read this message will be too, because the risk of breaking the very complex and complicated rAthena code is too high, and the benefits are very low.

 

I do not intend to argue with anyone here or to tell anyone what they need to do. But please, before you break the functionality of the emulator, give evidence that your change will really be necessary and useful in practice and will not break anything that works right now. I do not want to offend or humiliate anyone, but my experience tells me (subjectively) that such ideas have only broken functionality and have not added anything of value to the project. I'm worried about the stability of the project and its development, and I do not want it to be stalled by some dubious change. I hope you understand my point. :)

 


  • 1 month later...

  • Group:  Members
  • Topic Count:  16
  • Topics Per Day:  0.00
  • Content Count:  737
  • Reputation:   216
  • Joined:  11/29/11
  • Last Seen:  

@phit666, want to share? From my understanding, src/common had evdp_epoll.c, which, although it is not libevent, looks like somewhat the same concept.

I would be interested to see some metrics: how much is gained by this?

In the same vein, I thought we could replace our inter-server communication with gRPC, which would translate into better compression for packets. Currently we have none =(.


  • 3 years later...

  • Group:  Members
  • Topic Count:  4
  • Topics Per Day:  0.00
  • Content Count:  11
  • Reputation:   2
  • Joined:  09/01/15
  • Last Seen:  

https://github.com/phit666/rathena

Added the libevent library for sockets. It uses EPOLL on Linux. On Windows it currently uses SELECT and will use IOCP as the backend.
