Sep 28, 2010
From November 2009 till August 2010 I developed and launched an emailing system.
The goal of the project was to create a system that sends large volumes of email from many servers simultaneously. The system is designed to be used by multiple users through a web interface while achieving a high delivery rate.
For example, it can be used by website owners who want to send newsletters to a large list of users without investing in their own infrastructure.
For the user, the process of sending emails consists of registering servers in the system, adding domains, creating campaigns and monitoring their progress. The main features:
- Distributing a campaign among multiple working servers with multiple IPs.
- Automatic server setup and deployment.
- Independence from server location; minimal hardware and connection requirements.
- Almost linear scalability of the system.
- Precise tuning of the delivery process, such as limiting the delivery rate per recipient domain, which helps avoid blocks for flooding by Yahoo/Gmail/etc.
- Working with large recipient lists of millions of records; multiple white/black lists can be used in a campaign.
- HTTP API for adding recipients to the lists, for integration with other applications.
- RAR- and zip-compressed lists are accepted for convenience.
- Multiple domains used in a campaign.
- Link masking with multiple user domains.
- Automatic domain management: creation of sub-domains and MX records per IP.
- Click accounting.
- Bounce and unsubscribe tracking: special black lists are created and taken into account in future campaigns.
- Web interface for users, aggregate stats and a delivery speed graph for campaigns, and an admin area for user and system management.
- Tags in templates, like “Hello [UserName]”.
- Plain text, HTML or multipart (HTML + plain text) letters.
- Encoding selection: 7bit, quoted-printable or base64.
- Selecting IPs and domains for a campaign.
- Detecting offline servers.
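The “Hello [UserName]” tags mentioned above amount to a simple substitution step. A minimal sketch in PHP, assuming a hypothetical `renderTemplate()` helper (not the system's actual code):

```php
<?php
// Hypothetical helper: replaces [Tag]-style placeholders in a template
// with per-recipient values. Not the system's actual implementation.
function renderTemplate(string $template, array $params): string
{
    foreach ($params as $tag => $value) {
        $template = str_replace('[' . $tag . ']', $value, $template);
    }
    return $template;
}

echo renderTemplate('Hello [UserName]!', ['UserName' => 'Alice']);
// prints "Hello Alice!"
```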
There are several interesting aspects to the project's implementation.
Internally, the system is organized as a set of web services. Each working server runs an MTA (mail transport agent) to send the letters and a lightweight web server for the web service. The central management server communicates with the workers over HTTP (rather than SMTP) to push letters, gather delivery status and subscription letters, and manage remote configuration.
When letters are pushed to the working servers, only the template and its parameters (addresses, names, links, etc.) are sent; the individual letters are generated on the worker servers. The data sent to the working servers is encrypted.
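As a rough illustration of this push model, the sketch below builds a hypothetical payload (the template travels once, only the per-recipient parameters repeat), encrypts it with OpenSSL for transport, and expands it into individual letters on the worker side. All field names and the cipher choice are assumptions, not the system's actual wire format:

```php
<?php
// Hypothetical push payload: one template, many parameter sets.
$payload = [
    'template' => 'Hello [UserName], read more at [Link]',
    'recipients' => [
        ['Email' => 'a@example.com', 'UserName' => 'Alice', 'Link' => 'https://ex.com/?u=1'],
        ['Email' => 'b@example.com', 'UserName' => 'Bob',   'Link' => 'https://ex.com/?u=2'],
    ],
];

// Encrypt before sending over HTTP (the post mentions OpenSSL;
// AES-256-CBC here is an illustrative choice).
$key    = random_bytes(32);
$iv     = random_bytes(16);
$plain  = json_encode($payload);
$cipher = openssl_encrypt($plain, 'aes-256-cbc', $key, OPENSSL_RAW_DATA, $iv);

// The worker decrypts and generates one letter per recipient.
$decoded = json_decode(
    openssl_decrypt($cipher, 'aes-256-cbc', $key, OPENSSL_RAW_DATA, $iv),
    true
);
foreach ($decoded['recipients'] as $r) {
    $letter = str_replace(
        ['[UserName]', '[Link]'],
        [$r['UserName'], $r['Link']],
        $decoded['template']
    );
    // hand $letter to the local MTA here
}
```

Sending the template once instead of millions of rendered letters is what cuts the traffic, and encrypting the whole payload keeps the recipient data safe even on rented VPSes.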
This approach has many advantages:
- cuts the traffic
- provides a high level of security
- makes it possible to use remote machines and VPSes as working servers
- offloads the central server
- relies on standard, widely used, trusted technologies (XML, HTTP, OpenSSL, cURL)
- allows traffic compression
The software used is Nginx (as the web server) and PHP. There is no database server: the bundled SQLite is used to store the queue of letters to send and the logs of the results. It works well, since there are no concurrent requests.
Best of all, neither the database nor the web server requires any human attention.
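A worker-side queue of this kind can be sketched with PHP's bundled SQLite3 extension; the table layout below is an assumption, not the system's actual schema:

```php
<?php
// Sketch of a per-worker letter queue in SQLite (assumed schema).
$db = new SQLite3(':memory:'); // the real system would use a file
$db->exec('CREATE TABLE queue (
    id     INTEGER PRIMARY KEY,
    rcpt   TEXT,
    letter TEXT,
    sent   INTEGER DEFAULT 0
)');

// The web service enqueues letters as they arrive from the center.
$ins = $db->prepare('INSERT INTO queue (rcpt, letter) VALUES (:rcpt, :letter)');
$ins->bindValue(':rcpt', 'a@example.com');
$ins->bindValue(':letter', 'Hello Alice');
$ins->execute();

// The sending daemon pulls unsent rows, hands them to the MTA,
// then marks them as sent.
$row = $db->querySingle('SELECT id, rcpt, letter FROM queue WHERE sent = 0 LIMIT 1', true);
$db->exec('UPDATE queue SET sent = 1 WHERE id = ' . (int)$row['id']);
```

With a single daemon reading the queue there is no lock contention, which is exactly the situation where SQLite shines.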
The requirements for a working server are minimal:
- a clean CentOS 5 installation
- correctly configured networking
- a free port 25: no Plesk/Qmail/Postfix running
- 256 MB RAM
- a correct date/time/time zone and NTP synchronization are recommended
The process of automating server setup and remote management turned out to be much harder than I initially expected. There are several dozen commands one needs to execute and files to upload to prepare a system. Finding out the exact list, the exact order and the exact file permissions to set was not a trivial task.
Moreover, in the real world many unexpected problems appear: /etc/resolv.conf is not configured, the clock or time zone is incorrect, port 25 is occupied by some software, autoconf is not installed, and so on.
All you can do is know where the server came from, and check its configuration if you are not sure about it. After all, a detailed log is written during the setup process.
When adding a server, the user submits the root password, which is not saved anywhere; instead, a public key is uploaded, and key authorization is used for all further access to the worker servers. The server setup routine is implemented as a daemon running on the management server and called by a web script. Most configuration updates, like rate limit changes, domain management or clearing the queues of a campaign cancelled by the user, are done through a web service with encrypted messages.
The system is written mostly in PHP. Version 5.3 has good memory management, and the daemons are quite stable: I received no stability complaints during several months of operation and tens of millions of letters sent. At some point the DB became filled with data from old campaigns and required optimization.
The complexity of the system forced me to refactor the core modules several times, since new requirements appeared during the work, as usual. Everything is written with an OOP design and MVC, even though much of it is not a web application at all.
Mar 21, 2008
The story started 3 years ago, when I published the code for web proxies on phpclasses.org. The code became popular, and I received hundreds of emails.
Then I approached the idea more seriously and launched a web proxy site - browser.grik.net
It worked unexpectedly well: popularity grew by 10% per week without any attention from me, until it overloaded the server (a pretty weak one, I should mention).
Optimization allowed the site to serve 1-2 thousand users per day with 2 GB of daily traffic. About a year later, I turned the site off for personal reasons.
Now I am restoring the project with much bigger expectations.
The code will be published as open source, so everyone can download it and set up a good web proxy site.
I will launch at least 10 sites of my own, pointing to each other and forming a whole network of web proxies.
Later, other people willing to join the network will be invited.
The current plan is to reach 100 000 users per day.
The first site is launched at the old domain: http://browser.grik.net/
Thanks for your attention; I will keep posting about the progress.
Jan 7, 2008
You can get it from my site (gCurl.tar.bz2)
or from PHP Classes once it gets published.
I used it to handle HTTP requests/responses in a number of projects, like "PHP Server-side browser", "Craigslist submitter", and various spiders and crawlers, and now it will be a part of my Mirrors project.
Now I have updated it for PHP 5.2, commented it generously and added a number of usage samples.
Briefly, the package implements the commonly used routines of preparing HTTP requests and parsing server responses.
For requests, it provides the means to send cookies, custom headers, GET parameters and POST data.
Response processing includes parsing HTTP headers (including multi-line headers) and cookies, representing them as an array that is convenient to work with.
The most important feature of the package is an interface that helps assign handlers for the response headers and body.
Assigning a handler for a response body allows processing the data on-the-fly, by chunks, as received from the server.
This way, one can process an HTTP response body without waiting for all the data to be received and without consuming memory for the whole response body.
Most importantly, one can assign the body handler after processing the response headers. This allows choosing a handler depending on the Content-Type or a cookie.
This lets me write very flexible scripts dealing with HTTP.
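The underlying mechanism is cURL's header and write callbacks (gCurl wraps these; the snippet below uses raw PHP cURL, not gCurl's own interface). To keep it runnable without network access it fetches a local file through file://, so the header callback has little to report; over HTTP it would run line by line before any body data, letting you pick a body handler based on Content-Type:

```php
<?php
// Demo target: a local file fetched through cURL, so the
// snippet runs without network access.
$tmp = tempnam(sys_get_temp_dir(), 'demo');
file_put_contents($tmp, 'chunked body data');

$ch = curl_init('file://' . $tmp);

// Over HTTP this callback fires for each header line, before any
// body data arrives, so a body handler can be chosen from it.
curl_setopt($ch, CURLOPT_HEADERFUNCTION, function ($ch, $line) {
    return strlen($line); // cURL requires the number of bytes handled
});

// The body callback receives the response in chunks as it is
// downloaded; the full body never has to sit in memory.
$received = '';
curl_setopt($ch, CURLOPT_WRITEFUNCTION, function ($ch, $chunk) use (&$received) {
    $received .= $chunk;   // process each chunk on the fly here
    return strlen($chunk);
});

curl_exec($ch);
curl_close($ch);
unlink($tmp);
```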
I hope it helps you as well :)
Dec 26, 2007
How did the Richest Russians become Rich
It tells a nice story about the changes in culture and society that happened after the fall of the USSR.