Server that can poll and store data from 15,000 sensors

Started by kyouxngofi, Apr 18, 2023, 06:06 AM


kyouxngofi (topic starter)

My current project is a solar energy control system, and I am personally responsible for building its controller (the "brain") from scratch. The prototype already works autonomously; what I need now is a server that will collect the data from these brains and display it on a web interface.
Since I am learning Debian in parallel with development, I would prefer to self-host rather than use third-party hosting.

The ultimate goal is to equip 1,500 houses with this technology; each house's brain would transmit sensor data (temperature, voltage, current, illumination, etc.) to a central computer. I need a database to store and retrieve all of this data, and a way to monitor the status of each brain through the web interface.

What hardware do I need for this task, and how are such systems generally built?
Is there any resource or guide that can help me understand how everything fits together?


doro

Collecting statistics like this can be done on almost any hardware, and you don't necessarily need a dedicated IP for the server (dynamic DNS is an option) or even MySQL.

As for monitoring and display, any general-purpose language will do. You could put together a Bootstrap + PHP interface in a matter of hours. All in all, the task is straightforward.

john121

Based on your comment, it sounds like you prefer building everything from scratch. In that case, you will want a networking library such as Boost.Asio or something comparable.

You can find numerous articles on Habr about this topic, covering Asio and libevent. Connecting to a database is not difficult either: there are multiple C++ bindings available, including for MySQL.

However, if dealing with low-level sockets, multithreading, and debugging seems daunting, you could opt for Erlang instead, which handles this kind of concurrent I/O very well.
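If C++ feels like too much up front, the same accept-loop pattern is easy to prototype in Python's asyncio. A minimal sketch; the port and the newline-delimited line format are my own assumptions, not anything from this thread:

```python
import asyncio

# Minimal TCP collector: each "brain" connects and sends one
# newline-terminated reading per line, e.g. "house42 temperature 21.5".

async def handle_brain(reader, writer):
    peer = writer.get_extra_info("peername")
    try:
        while line := await reader.readline():  # b"" at EOF ends the loop
            reading = line.decode().strip()
            if reading:
                print(f"{peer}: {reading}")  # replace with a database insert
    finally:
        writer.close()
        await writer.wait_closed()

async def main():
    server = await asyncio.start_server(handle_brain, "0.0.0.0", 9000)
    async with server:
        await server.serve_forever()

asyncio.run(main())
```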

addisoncave

Check out graphite.wikidot.com. It offers a user-friendly database and an easy-to-use API for adding data. It also includes a web interface, numerous aggregate functions, scalability, and compatibility with various systems and applications.

The platform can plot any stored value and supports bulk data entry for any time interval. Installation and learning the basics are straightforward, and it scales from small installations monitoring just a few values up to large deployments collecting parameters from tens of thousands of objects.
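For reference, Graphite ingests metrics through its carbon daemon's plaintext protocol: one "path value timestamp" line per metric, TCP port 2003 by default. A minimal sketch; the host name and metric path are illustrative assumptions:

```python
import socket
import time

# Send one reading to Graphite's carbon daemon over the plaintext
# protocol: "metric.path value unix_timestamp\n" on TCP port 2003.

def send_metric(path: str, value: float, host: str = "graphite.local") -> None:
    line = f"{path} {value} {int(time.time())}\n"
    with socket.create_connection((host, 2003), timeout=5) as sock:
        sock.sendall(line.encode())

send_metric("houses.42.temperature", 21.5)
```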

sick9sut

Your hardware requirements will depend on a few factors: the volume of data being generated, the frequency of transmission, and the processing power needed to analyze and display that data in real time. Let's break it down step by step.

1. Central Server Requirements:
CPU: Given that you're planning to handle data from 1500 houses, you'll need a pretty robust CPU. At a minimum, I would suggest something like a quad-core processor, preferably with multi-threading capabilities. As the number of connected houses increases, you may want to scale up to more cores or even consider a server with multiple CPUs.
RAM: For handling and processing real-time data, 16GB of RAM should be the minimum. If your data processing becomes more complex, you might want to consider 32GB or more.
Storage: With sensor data being transmitted constantly, you'll need substantial storage. SSDs are a must for fast read/write speeds. Depending on how long you plan to keep historical data, you might start with at least 1TB and scale up from there; see the rough estimate after this list.
Network: A good network interface card (NIC) is crucial, especially if you're dealing with real-time data. A gigabit Ethernet connection should be a starting point. You'll also need a reliable and possibly redundant internet connection to ensure that data from all the houses is continuously streamed without interruption.
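To make "substantial storage" concrete, here is a back-of-the-envelope calculation. The reading size and interval are assumptions; swap in your real payload:

```python
# Rough storage estimate for raw readings (all figures are assumptions).
sensors = 15_000         # 1,500 houses x ~10 sensors each
interval_s = 60          # one reading per sensor per minute
bytes_per_reading = 64   # timestamp + ids + value + index overhead

per_day = sensors * (86_400 // interval_s) * bytes_per_reading
print(f"{per_day / 1e9:.2f} GB/day, {per_day * 365 / 1e12:.2f} TB/year")
# -> 1.38 GB/day, 0.50 TB/year before compression or downsampling
```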

2. Database Selection:
You'll need a database that can handle time-series data effectively, since you're dealing with sensor readings over time. InfluxDB is a popular choice for time-series databases. It's designed specifically for scenarios like yours where you're storing and querying large amounts of timestamped data.
Alternatively, you could consider using PostgreSQL with the TimescaleDB extension, which adds time-series functionality to PostgreSQL. This would give you the flexibility of a full-featured SQL database with the ability to efficiently handle time-series data.
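As a concrete illustration of the TimescaleDB route, a minimal schema might look like the sketch below. The connection string, table, and column names are my own placeholders:

```python
import psycopg2

# Create a TimescaleDB hypertable for sensor readings and insert one row.
conn = psycopg2.connect("dbname=sensors user=postgres host=localhost")
with conn, conn.cursor() as cur:
    cur.execute("CREATE EXTENSION IF NOT EXISTS timescaledb")
    cur.execute("""
        CREATE TABLE IF NOT EXISTS readings (
            time     TIMESTAMPTZ      NOT NULL,
            house_id INTEGER          NOT NULL,
            sensor   TEXT             NOT NULL,
            value    DOUBLE PRECISION NOT NULL
        )
    """)
    # Partition the table by time; idempotent thanks to if_not_exists.
    cur.execute("SELECT create_hypertable('readings', 'time', if_not_exists => TRUE)")
    cur.execute(
        "INSERT INTO readings VALUES (now(), %s, %s, %s)",
        (42, "temperature", 21.5),
    )
conn.close()
```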

3. Data Transmission:
For communication between the brains in each house and the central server, you might consider using MQTT (Message Queuing Telemetry Transport). It's a lightweight messaging protocol that's designed for IoT applications and works well in environments with limited bandwidth.
Each brain would publish sensor data to an MQTT broker (which could be running on your central server), and your server could subscribe to these topics to receive and process the data in real-time.
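A hedged sketch of that pattern with the paho-mqtt Python client; the broker address and the houses/<id>/<sensor> topic scheme are assumptions:

```python
import paho.mqtt.client as mqtt

# Server side: subscribe to readings from every house.
# On paho-mqtt 2.x, use mqtt.Client(mqtt.CallbackAPIVersion.VERSION1) instead.

def on_connect(client, userdata, flags, rc):
    client.subscribe("houses/+/+")  # all houses, all sensors

def on_message(client, userdata, msg):
    print(f"{msg.topic}: {msg.payload.decode()}")  # replace with a database insert

client = mqtt.Client()
client.on_connect = on_connect
client.on_message = on_message
client.connect("broker.local", 1883)
client.loop_forever()
```

Each brain would then publish with the same client, e.g. client.publish("houses/42/temperature", "21.5").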

4. Web Interface:
You'll need to create a web interface that allows you to monitor all 1500 houses. For this, you could use Node.js with Express.js for the backend, handling data from the database and passing it to the frontend.
For the frontend, consider using a framework like React or Vue.js to build a dynamic and responsive interface. You can visualize the data using libraries like D3.js or Chart.js to create real-time graphs and dashboards.
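The post above suggests Node.js/Express; since the other sketches in this thread are Python, here is the same idea with Flask instead. The route and query are assumptions matching the TimescaleDB sketch earlier, and a Chart.js frontend could simply poll this endpoint:

```python
from flask import Flask, jsonify
import psycopg2

app = Flask(__name__)

# Return the latest reading per sensor for one house as JSON.
@app.route("/api/houses/<int:house_id>/latest")
def latest_readings(house_id):
    conn = psycopg2.connect("dbname=sensors user=postgres host=localhost")
    with conn, conn.cursor() as cur:
        cur.execute("""
            SELECT DISTINCT ON (sensor) sensor, value, time
            FROM readings
            WHERE house_id = %s
            ORDER BY sensor, time DESC
        """, (house_id,))
        rows = cur.fetchall()
    conn.close()
    return jsonify({s: {"value": v, "time": t.isoformat()} for s, v, t in rows})

if __name__ == "__main__":
    app.run(port=8080)
```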

5. Security:
With data being transmitted from potentially thousands of houses, security is paramount. Make sure to implement SSL/TLS for data transmission to encrypt the data in transit. You should also consider using VPNs or other methods to secure the connections between the brains and the central server.
It's also important to regularly update and patch your server to protect against vulnerabilities.
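For example, the paho-mqtt client from the earlier sketch can be pointed at a TLS-enabled broker in a few lines; the certificate path and credentials are placeholders:

```python
import paho.mqtt.client as mqtt

# Encrypt the broker connection; 8883 is the conventional MQTT-over-TLS port.
client = mqtt.Client()  # paho-mqtt 2.x: mqtt.Client(mqtt.CallbackAPIVersion.VERSION1)
client.tls_set(ca_certs="/etc/ssl/certs/my-ca.crt")     # CA that signed the broker cert
client.username_pw_set("house42", "per-device-secret")  # unique credentials per brain
client.connect("broker.local", 8883)
```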

6. Learning Resources:
For Debian, the official documentation and forums are incredibly helpful. Since you're learning it in parallel, focus on understanding server administration, networking, and security aspects.
For database management, you can start with the documentation of InfluxDB or PostgreSQL, depending on which you choose. There are also plenty of tutorials on setting up and optimizing these databases.

For the web interface, if you're new to Node.js, consider the book "Node.js Design Patterns" by Mario Casciaro, which provides a solid foundation. For frontend development, there are many free resources on MDN Web Docs and YouTube tutorials for React or Vue.js.
Building such a system requires careful planning and consideration of scalability and security. Make sure to prototype your setup with a smaller number of houses first to identify any potential bottlenecks or issues before scaling up. And remember, the hardware requirements can always be adjusted as you gain more insight into the actual performance and data load of the system.

Nidhibng

A server polling and storing data from 15,000 sensors needs adequate processing power, efficient data handling, and enough storage capacity. It must support real-time acquisition, cope with large data volumes, and perform reliably. Plan for a scalable architecture, solid networking, and data-integrity safeguards; load balancing and fault tolerance help keep ingestion running if a component fails. Finally, optimize for fast retrieval and analysis, since the same data will feed monitoring, automation, and other IoT applications.

ZokEntinnyhok

I'd recommend a quad-core processor, at least 8GB of RAM, and a 1TB SSD for storing your database. For the database itself, consider using a relational database management system like PostgreSQL or MySQL. As for the web interface, you can use a Python-based framework like Flask or Django to create a RESTful API that interacts with your database.

To handle the influx of data from 1500 houses, you'll need to implement a data ingestion pipeline that can handle high volumes of data. Consider using a message broker like Apache Kafka or RabbitMQ to handle the data stream. For monitoring and analytics, you can use tools like Grafana and Prometheus.
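For instance, with the kafka-python client the ingestion side could start as simply as the sketch below; the broker address and topic name are assumptions:

```python
import json
from kafka import KafkaProducer

# Publish each sensor reading to a Kafka topic so downstream consumers
# (database writer, alerting, analytics) can process the stream independently.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode(),
)
producer.send("sensor-readings", {"house": 42, "sensor": "voltage", "value": 231.7})
producer.flush()
```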

For resources, check out the official documentation for each technology, as well as online courses on platforms like Udemy and Coursera. As for guides, I'd recommend "Full Stack Development with Python" by Shalabh Aggarwal and "Database Systems: The Complete Book" by Hector Garcia-Molina.


dexcowork

Experience seamless data management with a server that polls and stores data from 15,000 sensors. With real-time data acquisition and scalable storage, it ensures reliable monitoring. Put your trust in technology that handles data well.

Zinavopvtltd

To handle polling and storing data from 15,000 sensors, choose a high-performance server with scalable cloud infrastructure, robust databases (e.g., PostgreSQL), and optimized data streaming protocols like MQTT.

