Recommended Steps For New FreeBSD 12.0 Servers

Introduction

When setting up a new FreeBSD server, there are a number of optional steps you can take to get your server into a more production-friendly state. In this guide, we will cover some of the most common examples.

We will set up a simple, easy-to-configure firewall that denies most traffic. We will also make sure that your server’s time zone accurately reflects its location. We will set up NTP polling in order to keep the server’s time accurate and, finally, demonstrate how to add some extra swap space to your server.

Before you get started with this guide, you should log in and configure your shell environment the way you’d like it. You can find out how to do this by following this guide.

How To Configure a Simple IPFW Firewall

The first task is setting up a simple firewall to secure your server.

FreeBSD supports and includes three separate firewalls. These are called pf, ipfw, and ipfilter. In this guide, we will be using ipfw as our firewall. ipfw is a secure, stateful firewall written and maintained as part of FreeBSD.

Configuring the Basic Firewall

Almost all of your configuration will take place in the /etc/rc.conf file. To modify the configuration you’ll use the sysrc command, which allows users to change configuration in /etc/rc.conf in a safe manner. Inside this file you’ll add a number of different lines to enable and control how the ipfw firewall will function. You’ll start with the essential rules; run the following command to begin:

  • sudo sysrc firewall_enable="YES"

Each time you run sysrc to modify your configuration, you’ll receive output showing the changes:

Output
firewall_enable: NO -> YES

As you may expect, this first command enables the ipfw firewall, starting it automatically at boot and allowing it to be started with the usual service commands.

Now run the following:

  • sudo sysrc firewall_quiet="YES"

This tells ipfw not to output anything to standard out when it performs certain actions. This might seem like a matter of preference, but it actually affects the functionality of the firewall.

Two factors combine to make this an important option. The first is that the firewall configuration script is executed in the current shell environment, not as a background task. The second is that when the ipfw command reads a configuration script without the "quiet" flag, it reads and outputs each line, in turn, to standard out. When it outputs a line, it immediately executes the associated action.

Most firewall configuration files flush the current rules at the top of the script in order to start with a clean slate. If the ipfw firewall comes across a line like this without the quiet flag, it will immediately flush all rules and revert to its default policy, which is usually to deny all connections. If you’re configuring the firewall over SSH, this would drop the connection, close the current shell session, and none of the rules that follow would be processed, effectively locking you out of the server. The quiet flag allows the firewall to process the rules as a set instead of implementing each one individually.

After these two lines, you can begin configuring the firewall’s behavior. Now select "workstation" as the type of firewall you’ll configure:

  • sudo sysrc firewall_type="workstation"

This sets the firewall to protect the server from which you’re configuring the firewall using stateful rules. A stateful firewall monitors the state of network connections over time and stores information about these connections in memory for a short time. As a result, not only can rules be defined on what connections the firewall should allow, but a stateful firewall can also use the data it has learned about previous connections to evaluate which connections can be made.

The /etc/rc.conf file also allows you to customize the services you want clients to be able to access by using the firewall_myservices and firewall_allowservices options.

Run the following command to open ports that should be accessible on your server, such as port 22 for your SSH connection and port 80 for a conventional HTTP web server. If you use SSL on your web server, make sure to add port 443:

  • sudo sysrc firewall_myservices="22/tcp 80/tcp 443/tcp"

The firewall_myservices option is set to a list of TCP ports or services, separated by spaces, that should be accessible on your server.

Note: You could also use services by name. The services that FreeBSD knows by name are listed in the /etc/services file. For instance, you could change the previous command to something like this:

  • sudo sysrc firewall_myservices="ssh http https"

This would have the same results.

The firewall_allowservices option lists items that should be allowed to access the provided services. Therefore it allows you to limit access to your exposed services (from firewall_myservices) to particular machines or network ranges. For example, this could be useful if you want a machine to host web content for an internal company network. The keyword "any" means that any IPs can access these services, making them completely public:

  • sudo sysrc firewall_allowservices="any"

The firewall_logdeny option tells ipfw to log all connection attempts that are denied to a file located at /var/log/security. Run the following command to set this:

  • sudo sysrc firewall_logdeny="YES"

To check on the changes you’ve made to the firewall configuration, run the following command:

  • grep 'firewall' /etc/rc.conf

This portion of the /etc/rc.conf file will look like this:

Output
firewall_enable="YES"
firewall_quiet="YES"
firewall_type="workstation"
firewall_myservices="22/tcp 80/tcp 443/tcp"
firewall_allowservices="any"
firewall_logdeny="YES"

Remember to adjust the firewall_myservices option to reference the services you wish to expose to clients.
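
Because a missing SSH entry can lock you out once the firewall starts, it may be worth checking the service list first. The following is a minimal sketch, not part of FreeBSD: the allows_ssh function name is hypothetical, and it assumes SSH listens on the default port 22.

```shell
#!/bin/sh
# Return 0 if the given firewall_myservices value exposes SSH.
# Accepts "22", "22/tcp", "ssh", or "ssh/tcp" as a list entry.
# Assumption: sshd listens on the default port 22.
allows_ssh() {
    for svc in $1; do
        case "$svc" in
            22|22/tcp|ssh|ssh/tcp) return 0 ;;
        esac
    done
    return 1
}

# Example with the value used in this guide (hard-coded for illustration).
if allows_ssh "22/tcp 80/tcp 443/tcp"; then
    echo "SSH is allowed"
else
    echo "WARNING: SSH is not in firewall_myservices"
fi
```

On the server itself, you could pass the live value instead of the hard-coded string, for example allows_ssh "$(sysrc -n firewall_myservices)".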

Allowing UDP Connections (Optional)

The ports and services listed in the firewall_myservices option in the /etc/rc.conf file allow access for TCP connections. If you have services that you wish to expose that use UDP, you need to edit the /etc/rc.firewall file:

  • sudo vi /etc/rc.firewall

You configured your firewall to use the "workstation" firewall type, so look for a section that looks like this:

/etc/rc.firewall
. . .
[Ww][Oo][Rr][Kk][Ss][Tt][Aa][Tt][Ii][Oo][Nn])
. . .

There is a section within this block that is dedicated to processing the firewall_allowservices and firewall_myservices values that you set. It will look like this:

/etc/rc.firewall
for i in ${firewall_allowservices} ; do
  for j in ${firewall_myservices} ; do
    ${fwcmd} add pass tcp from $i to me $j
  done
done

After this section, you can add any services or ports that should accept UDP packets by adding lines like this:

${fwcmd} add pass udp from any to me port_num

In vi, press i to switch to INSERT mode and add your content, then save and close the file by pressing ESC, typing :wq, and pressing ENTER. In the previous example, you can leave the "any" keyword if the connection should be allowed for all clients or change it to a specific IP address or network range. The port_num should be replaced by the port number or service name you wish to allow UDP access to. For example, if you’re running a DNS server, you may wish to have a line that looks something like this:

for i in ${firewall_allowservices} ; do
  for j in ${firewall_myservices} ; do
    ${fwcmd} add pass tcp from $i to me $j
  done
done

${fwcmd} add pass udp from 192.168.2.0/24 to me 53

This will allow any client from within the 192.168.2.0/24 network range to access a DNS server operating on the standard port 53. Note that in this example you would also want to open this port up for TCP connections as that is used by DNS servers for longer replies.
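
To preview which rules this loop and the extra UDP line would produce, you can run a dry-run version that prints the rules instead of installing them. This is only a sketch: "echo ipfw -q" stands in for the real ${fwcmd}, the preview_rules function name is hypothetical, and stripping the "/tcp" suffix is an assumption about how port entries such as 22/tcp should be reduced to a bare port.

```shell
#!/bin/sh
# Dry run: print the rules instead of installing them.
# "echo ipfw -q" is a stand-in for the real ${fwcmd} in /etc/rc.firewall.
fwcmd="echo ipfw -q"
firewall_allowservices="any"
firewall_myservices="22/tcp 80/tcp 443/tcp"

preview_rules() {
    for i in ${firewall_allowservices}; do
        for j in ${firewall_myservices}; do
            # Strip an optional "/tcp" suffix so the port stands alone.
            ${fwcmd} add pass tcp from "$i" to me "${j%/tcp}"
        done
    done
    # The extra UDP rule from the DNS example above.
    ${fwcmd} add pass udp from 192.168.2.0/24 to me 53
}

preview_rules
```

Each allowed source is combined with each service, so one source and three services yield three TCP rules plus the single UDP rule. Once the firewall is running, sudo ipfw list shows the rules that were actually installed.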

Save and close the file when you are finished.

Starting the Firewall

When you are finished with your configuration, you can start the firewall by typing:

  • sudo service ipfw start

The firewall will start correctly, blocking unwanted traffic while adhering to your allowed services and ports. This firewall will start automatically at every boot.

You also want to configure a limit on how many denials per IP address you’ll log. This will prevent your logs from filling up from a single, persistent user. You can do this in the /etc/sysctl.conf file:

  • sudo vi /etc/sysctl.conf

At the bottom of the file, you can limit your logging to "5" by adding the following line:

/etc/sysctl.conf
...
net.inet.ip.fw.verbose_limit=5

Save and close the file when you are finished. This will configure that setting on the next boot.

To implement this same behavior for your currently active session without restarting, you can use the sysctl command itself, like this:

  • sudo sysctl net.inet.ip.fw.verbose_limit=5

This should immediately implement the limit for this boot.
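
If you would rather script the file change than edit it by hand, the sketch below sets a key in a sysctl.conf-style file and replaces any earlier assignment of the same key, so running it twice does not leave duplicate lines. The set_conf_value function name is hypothetical, and the demonstration writes to a scratch file rather than /etc/sysctl.conf.

```shell
#!/bin/sh
# Set key=value in a sysctl.conf-style file, replacing any earlier
# assignment of the same key so repeated runs leave a single line.
set_conf_value() {
    key=$1 value=$2 file=$3
    [ -f "$file" ] || : > "$file"
    tmp=$(mktemp) || return 1
    # Keep every line that does not start with "key=".
    awk -v k="$key=" 'index($0, k) != 1' "$file" > "$tmp"
    printf '%s=%s\n' "$key" "$value" >> "$tmp"
    cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Demonstration against a scratch file rather than /etc/sysctl.conf.
demo=$(mktemp)
set_conf_value net.inet.ip.fw.verbose_limit 10 "$demo"
set_conf_value net.inet.ip.fw.verbose_limit 5 "$demo"
cat "$demo"
rm -f "$demo"
```

Pointing the function at /etc/sysctl.conf would require root privileges, and the value still only takes effect at the next boot unless you also run the sysctl command shown above.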

How To Set the Time Zone for Your Server

It is a good idea to correctly set the time zone for your server. This is an important step for when you configure NTP time synchronization in the next section.

FreeBSD comes with a menu-based tool called tzsetup for configuring time zones. To set the time zone for your server, call this command with sudo privileges:

  • sudo tzsetup

First, you will be asked to select the region of the world your server is located in:

FreeBSD region of the world

You will need to choose a sub-region or country next:

FreeBSD country

Note: To navigate these menus, you’ll need to use the PAGE UP and PAGE DOWN keys. If you do not have these on your keyboard, you can use FN + DOWN or FN + UP.

Finally, select the specific time zone that is appropriate for your server:

FreeBSD time zone

Confirm the time zone selection that is presented based on your choices.

At this point, your server’s time zone should match the selections you made.
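
A quick way to confirm the result is to print the current time and check the zone abbreviation. The sketch below uses the TZ environment variable to show the same moment in two zones; the time_in_zone function name is hypothetical and the zone names are only examples.

```shell
#!/bin/sh
# Print the current time as it would appear in a given time zone.
# TZ overrides the zone for this one command only; it does not
# change the system-wide setting made by tzsetup.
time_in_zone() {
    TZ="$1" date "+%Y-%m-%d %H:%M %Z"
}

time_in_zone UTC
time_in_zone America/New_York   # example zone name, pick your own
```

Running date with no TZ override should report the zone you selected in tzsetup.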

How To Configure NTP to Keep Accurate Time

Now that you have the time zone configured on your server, you can set up NTP, or Network Time Protocol. This will help keep your server’s time in sync with others throughout the world. This is important for time-sensitive client-server interactions as well as accurate logging.

Again, you can enable the NTP service on your server by adjusting the /etc/rc.conf file. Run the following command to add the line ntpd_enable="YES" to the file:

  • sudo sysrc ntpd_enable="YES"

You also need to add a second line that will sync the time on your machine with the remote NTP servers at boot. This is necessary because it allows your server to exceed the normal drift limit on initialization. Your server will likely be outside of the drift limit at boot because your time zone will be applied prior to the NTP daemon starting, which will offset your system time:

  • sudo sysrc ntpd_sync_on_start="YES"

If you did not have this line, your NTP daemon would fail when started due to the timezone settings that skew your system time prior in the boot process.

You can start your ntpd service by typing:

  • sudo service ntpd start

This will maintain your server’s time by synchronizing with the NTP servers listed in /etc/ntp.conf.

How To Configure Extra Swap Space

On FreeBSD servers configured on DigitalOcean, 1 Gigabyte of swap space is automatically configured regardless of the size of your server. You can see this by typing:

  • sudo swapinfo -g

It should show something like this:

Output
Device          1G-blocks Used Avail Capacity
/dev/gpt/swapfs         1    0     1       0%

Some users and applications may need more swap space than this. This is accomplished by adding a swap file.

The first thing you need to do is to allocate a chunk of the filesystem for the file you want to use for swap. You’ll use the truncate command, which can quickly allocate space on the fly.

We’ll put the swapfile in /swapfile for this tutorial but you can put the file anywhere you wish, like /var/swapfile for example. This file will provide an additional 1 Gigabyte of swap space. You can adjust this number by modifying the value given to the -s option:

  • sudo truncate -s 1G /swapfile

After you allocate the space, you need to lock down access to the file. Normal users should not have any access to the file:

  • sudo chmod 0600 /swapfile
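
To see what these two commands do without touching /swapfile, you can try them on a small scratch file first. This is a minimal sketch; the make_swapfile function name is hypothetical and the 1M size is chosen only to keep the test quick.

```shell
#!/bin/sh
# Create a file of the requested size with owner-only permissions.
make_swapfile() {
    path=$1 size=$2
    truncate -s "$size" "$path" || return 1
    chmod 0600 "$path"
}

# Demonstration on a small scratch file instead of /swapfile.
scratch=$(mktemp)
make_swapfile "$scratch" 1M
ls -l "$scratch"
rm -f "$scratch"
```

The ls -l line should show a mode of -rw------- and the requested size, which is what you want to see on the real swap file as well.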

Next, associate a pseudo-device with your file and configure it to mount at boot by typing:

  • echo "md99 none swap sw,file=/swapfile,late 0 0" | sudo tee -a /etc/fstab

This command adds a line that looks like this to the /etc/fstab file:

md99 none swap sw,file=/swapfile,late 0 0 
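
Note that running the tee -a command a second time would append a duplicate entry. A minimal sketch of a guard against that follows; the append_once function name is hypothetical, and the demonstration uses a scratch file rather than /etc/fstab.

```shell
#!/bin/sh
# Append a line to a file only if that exact line is not already present.
append_once() {
    line=$1 file=$2
    grep -qxF -- "$line" "$file" 2>/dev/null || printf '%s\n' "$line" >> "$file"
}

# Demonstration against a scratch file rather than /etc/fstab.
scratch=$(mktemp)
entry="md99 none swap sw,file=/swapfile,late 0 0"
append_once "$entry" "$scratch"
append_once "$entry" "$scratch"   # second call is a no-op
cat "$scratch"
rm -f "$scratch"
```

Writing to the real /etc/fstab this way would require root privileges.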

After the line is added to your /etc/fstab file, you can activate the swap file for the session by typing:

  • sudo swapon -aqL

You can verify that the swap file is now working by using the swapinfo command again:

  • sudo swapinfo -g

You should see the additional device (/dev/md99) associated with your swap file:

Output
Device          1G-blocks Used Avail Capacity
/dev/gpt/swapfs         1    0     1       0%
/dev/md99               1    0     1       0%
Total                   2    0     2       0%

This swap file will be mounted automatically at each boot.

Conclusion

The steps outlined in this guide can be used to bring your FreeBSD server into a more production-ready state. By configuring basic essentials like a firewall, NTP synchronization, and appropriate swap space, your server can be used as a good base for future installations and services.

DigitalOcean Community Tutorials

Understanding Tableau Prep and Conductor: Clean and Join Steps

The last post in this series showed you how to use the Wildcard union and the Manual union to join four years of sales data that was stored in four different worksheets. In this post, you’ll see how the cleaning step can be used to do the following:

  • Change field-data roles
  • Group and replace members of a specific field set
  • Change the data types of a field and rename fields

All this can be done by using the Profile Cards within the Profile Pane in Tableau Prep Builder.

Using the Clean Step in Tableau Prep

The cleaning step behaves a lot like the data interpreter in Tableau Desktop, but it adds additional visualizations in the Profile Pane and the field Profile Cards that help you understand the shape and contents of each field. In addition, clicking on elements within the Profile Cards highlights related details in the other field cards. I find myself using the cleaning step frequently just to examine the results of prior steps.

Adding a Join Step

The Superstore dataset also includes another worksheet containing the names of Regional Managers. Regional Managers are assigned groups of States. We’ll use a join to add that information to the flow:

Now that the sales data from the Superstore tables have been cleaned up, we will bring some public data from the Census Bureau so that we can enhance our sales data by normalizing sales for the population by state.

Adding the Census Data into the Flow

I’ve been using Census data for many years. It’s useful when you want to account for the population density in analyses. In the example we’re building, this data will ultimately be used to express the sales by state in a way that accounts for the population density of each geography.

Using the Pivot Step and Tableau Prep Builder

The Census data isn’t perfectly formatted. We’ll use Prep Builder to fix the structural problems in that dataset:

In the video, I chose to do most of the field clean-up in the pivot step. I could have performed the same cleaning operations in the cleaning step that I added after the pivot. If the work you’re doing is going to be used only by you, fewer steps may save you time. If you work with a team of people who are new to Prep Builder, adding more steps to segregate individual cleaning operations may make your flow easier for others to understand. There aren’t hard and fast rules.

This workflow now includes two different data sources and six tables. You’ve seen two different ways to create a union; you’ve seen a join step and a pivot step; and you’ve learned about different ways you can use the cleaning step to improve the formatting and consistency of the data in the workflow. My colleague Katie wrote a blog post that takes a closer look at splitting and pivoting your data, so read it if you need more in-depth insights into those steps. For further information on cleansing your data, look at my colleague Spencer’s blog on the topic.

In the next post in this series, we’re going to join the Superstore data to the Census data. Because these two data sources are not aggregated in the same way, we’ll be presented with a challenge that we’ll address with an aggregate step.

The post Understanding Tableau Prep and Conductor: Clean and Join Steps appeared first on InterWorks.

InterWorks

Dataquest: How to Learn Python for Data Science In 5 Steps

Why Learn Python For Data Science?

Before we explore how to learn Python for data science, we should briefly answer why you should learn Python in the first place.

In short, understanding Python is one of the valuable skills needed for a data science career.

Though it hasn’t always been, Python is the programming language of choice for data science. Here’s a brief history:

  • In 2016, it overtook R on Kaggle, the premier platform for data science competitions.
  • In 2017, it overtook R on KDNuggets’s annual poll of data scientists’ most used tools.
  • In 2018, 66% of data scientists reported using Python daily, making it the number one tool for analytics professionals.

Data science experts expect this trend to continue with increasing development in the Python ecosystem. And while your journey to learn Python programming may be just beginning, it’s nice to know that employment opportunities are abundant (and growing) as well.

According to Indeed, the average salary for a Data Scientist is $127,918.

The good news? That number is only expected to increase. The experts at IBM predicted a 28% increase in demand for data scientists by the year 2020.

So, the future is bright for data science, and Python is just one piece of the proverbial pie. Fortunately, learning Python and other programming fundamentals is as attainable as ever. We’ll show you how in five simple steps.

But remember – just because the steps are simple doesn’t mean you won’t have to put in the work. If you apply yourself and dedicate meaningful time to learning Python, you have the potential to not only pick up a new skill, but potentially bring your career to a new level.

How to Learn Python for Data Science

First, you’ll want to find the right course to help you learn Python programming. Dataquest’s courses are specifically designed for you to learn Python for data science at your own pace.

In addition to learning Python in a course setting, your journey to becoming a data scientist should also include soft skills. Plus, there are some complementary technical skills we recommend you learn along the way.

Step 1: Learn Python Fundamentals

Everyone starts somewhere. This first step is where you’ll learn Python programming basics. You’ll also want an introduction to data science.

One of the important tools you should start using early in your journey is Jupyter Notebook, which comes prepackaged with Python libraries to help you learn these two things.

Kickstart your learning by: Joining a community

By joining a community, you’ll put yourself around like-minded people and increase your opportunities for employment. According to the Society for Human Resource Management, employee referrals account for 30% of all hires.

Create a Kaggle account, join a local Meetup group, and participate in Dataquest’s members-only Slack discussions with current students and alums.

Related skills: Try the Command Line Interface

The Command Line Interface (CLI) lets you run scripts more quickly, allowing you to test programs faster and work with more data.

Step 2: Practice Mini Python Projects

We truly believe in hands-on learning. You may be surprised by how soon you’ll be ready to build small Python projects.

Try programming things like calculators for an online game, or a program that fetches the weather in your city from Google. Building mini projects like these will help you learn Python. Programming projects like these are standard for all languages, and a great way to solidify your understanding of the basics.

You should start to build your experience with APIs and begin web scraping. Beyond helping you learn Python programming, web scraping will be useful for you in gathering data later.

Kickstart your learning by: Reading

Enhance your coursework and find answers to the Python programming challenges you encounter. Read guidebooks, blog posts, and even other people’s open source code to learn Python and data science best practices – and get new ideas.

Automate The Boring Stuff With Python by Al Sweigart is an excellent and entertaining resource.

Related skills: Work with databases using SQL

SQL is used to talk to databases to alter, edit, and reorganize information. SQL is a staple in the data science community, as 40% of data scientists report consistently using it.*

Step 3: Learn Python Data Science Libraries

Unlike some other programming languages, in Python, there is generally a best way of doing something. The three best and most important Python libraries for data science are NumPy, Pandas, and Matplotlib.

NumPy and Pandas are great for exploring and playing with data. Matplotlib is a data visualization library that makes graphs like you’d find in Excel or Google Sheets.

Kickstart your learning by: Asking questions

You don’t know what you don’t know!

Python has a rich community of experts who are eager to help you learn Python. Resources like Quora, Stack Overflow, and Dataquest’s Slack are full of people excited to share their knowledge and help you learn Python programming. We also have an FAQ for each mission to help with questions you encounter throughout your programming courses with Dataquest.

Related skills: Use Git for version control

Git is a popular tool that helps you keep track of changes made to your code, which makes it much easier to correct mistakes, experiment, and collaborate with others.

Step 4: Build a Data Science Portfolio as you Learn Python

For aspiring data scientists, a portfolio is a must.

These projects should include several different datasets and should leave readers with interesting insights that you’ve gleaned. Your portfolio doesn’t need a particular theme; find datasets that interest you, then come up with a way to put them together.

Displaying projects like these gives fellow data scientists something to collaborate on and shows future employers that you’ve truly taken the time to learn Python and other important programming skills.

One of the nice things about data science is that your portfolio doubles as a resume while highlighting the skills you’ve learned, like Python programming.

Kickstart your learning by: Communicating, collaborating, and focusing on technical competence

During this time, you’ll want to make sure you’re cultivating those soft skills required to work with others, making sure you really understand the inner workings of the tools you’re using.

Related skills: Learn beginner and intermediate statistics

While learning Python for data science, you’ll also want to get a solid background in statistics. Understanding statistics will give you the mindset you need to focus on the right things, so you’ll find valuable insights (and real solutions) rather than just executing code.

Step 5: Apply Advanced Data Science Techniques

Finally, aim to sharpen your skills. Your data science journey will be full of constant learning, but there are advanced courses you can complete to ensure you’ve covered all the bases.

You’ll want to be comfortable with regression, classification, and k-means clustering models. You can also step into machine learning – bootstrapping models and creating neural networks using scikit-learn.

At this point, programming projects can include creating models using live data feeds. Machine learning models of this kind adjust their predictions over time.

Remember to: Keep learning!

Data science is an ever-growing field that spans numerous industries.

At the rate that demand is increasing, there are exponential opportunities to learn. Continue reading, collaborating, and conversing with others, and you’re sure to maintain interest and a competitive edge over time.

How Long Will It Take To Learn Python?

After reading these steps, the most common question we have people ask us is: “How long does all this take?”

There are a lot of estimates for the time it takes to learn Python. For data science specifically, estimates range from 3 months to a year of consistent practice.

We’ve watched people move through our courses at lightning speed and others who have taken it much slower.

Really, it all depends on your desired timeline, the free time you can dedicate to learning Python programming, and the pace at which you learn.

Dataquest’s courses are created for you to go at your own speed. Each path is full of missions, hands-on learning, and opportunities to ask questions so that you can get an in-depth mastery of data science fundamentals.

Get started for free. Learn Python with our Data Scientist path and start mastering a new skill today.

Resources and studies cited:

Planet Python

Synchronize Axes Across Multiple Sheets in Five Simple Steps

reference line calculated field in Tableau

I often find myself building dashboards with a parameter to compare baseline data to different scenarios. Because these scenarios are built out in different worksheets, I am faced with the challenge of correctly displaying changes in results compared to each other. In the mocked-up example below, you can see how the axes change dynamically within each worksheet, yet the bars stay the same height, making it difficult to quickly see the difference.

Note: This example uses Global Superstore training data.

Incorrect GIF

If you hunt through the Tableau forums, you’ll find that the ability to synchronize axes across worksheets is not built into the software. The official suggestion from Tableau is to manually fix the axis height to the same value across all worksheets.

This works, except it is not very future-proof if the data ever changes or updates … and when do we ever work with purely static data? Thanks to my colleague Carl Slifer, we have a better solution—and you will love the simplicity of it.

Step-by-Step Guide to Synchronizing Axes in Tableau

Step 1: Create the first worksheet with your baseline data (I simply use total sales):

baseline scenario in Tableau

Step 2: Create a worksheet to represent your scenario. Here, I use a simple parameter to change sales +/- a factor of 10%:

edit parameter in Tableau

calculated field in Tableau

Hint: Once the parameter and calculated field are created, show the parameter and drag the calculated field to the Rows shelf to create a mirror image of the sales.

Step 3: Now, here comes the cool part. We need to add a reference line to each worksheet. We also need the reference line to be calculated by whichever value is larger: sales or the scenario-adjusted sales. We do this by creating a calculated field with a simple formula using MAX. The use of MAX is important because we want to stretch the axis of other worksheets relative to the value of the largest value being compared:

MAX calculated field in Tableau

This will re-position the reference line based on which value is largest between the two worksheets.

Step 4: Place the Reference Line calculated field on the Details tile of the Marks card for each worksheet. Next, format the reference line so it does not show a value or line:

reference line calculated field in Tableau

Step 5: Lastly, build out your dashboard, adjust the parameter control, and be dazzled as your bar graphs re-size relative to each other!

The last piece to keep in mind is to ensure you align your graphs appropriately on the dashboard. Consider putting both graphs in a container and checking that axis font sizes, titles, etc. are the same for each worksheet as these formatting pieces could also skew how the graphs visually align:

correct GIF

Synchronize Axes with a MIN Function

BONUS! Have negative values in your data? You can set up your axis to display negative values as well. Starting at Step 3, use the MIN function instead, and add a second reference line to your graphs based on the minimum value. This will give your graphs two reference lines that will respond dynamically to your data.

The post Synchronize Axes Across Multiple Sheets in Five Simple Steps appeared first on InterWorks.

InterWorks